JOURNAL OF 


REGIONAL SCIENCE 








Vol. 2 Spring 1960 





Samuel A. Stouffer 

INTERVENING OPPORTUNITIES 

AND COMPETING MIGRANTS 
Richard E. Quandt 

MODELS OF TRANSPORTATION AND 

OPTIMAL NETWORK CONSTRUCTION 
William Warntz and David Neft 

CONTRIBUTIONS TO A STATISTICAL 

METHODOLOGY FOR AREAL DISTRIBUTIONS 
Walter Isard and David J. Ostroff 

GENERAL INTERREGIONAL EQUILIBRIUM 
Charles M. Tiebout 

COMMUNITY INCOME MULTIPLIERS: 

A POPULATION GROWTH MODEL 





PUBLISHED BY THE REGIONAL SCIENCE RESEARCH INSTITUTE 
in cooperation with the 
Department of Regional Science of the Wharton School 


University of Pennsylvania 





Spring 1960 


JOURNAL OF REGIONAL SCIENCE 


Edited by W. Isard and M. B. Teitz 


Managing Editor, B. H. Stevens 





The Journal of Regional Science is published in two or more issues per 
volume by the Regional Science Research Institute, G.P.O. Box 8776, 
Philadelphia 1, Pa., in cooperation with the Department of Regional 
Science, Wharton School, University of Pennsylvania. 


The subscription rate for libraries, business and other organizations, 
and individuals is $5.00 per volume. All orders should be addressed to 
the Regional Science Research Institute and not to the Wharton School. 
Individuals only may receive a subscription by joining the Regional 
Science Association and paying an extra charge for the Journal. Total 
yearly rate for subscription to the Journal and membership in the 
Association is currently $3.00. For further information about membership 
in the Association write to Regional Science Association, Wharton School, 
University of Pennsylvania, Philadelphia 4, Pa. 


Back numbers and single issues may be obtained from the Institute while 
they are still in stock at a price of $2.50 per copy. 


Manuscripts are invited and, with other communications for the editors, 
should be addressed to the Institute. 





The Regional Science Research Institute is a non-profit corporation 
devoted to research and studies on the structure, function, and opera- 
tion of regions from an economic, social, and political standpoint. 








JOURNAL OF REGIONAL SCIENCE, VOL. 2, NO. 1, 1960 


INTERVENING OPPORTUNITIES 
AND COMPETING MIGRANTS 


by Samuel A. Stouffer* 


1. INTRODUCTION 


The aim of this paper is to make a contribution toward theory in 
human ecology and sociology. In 1940 the writer [13] introduced the 
concept of intervening opportunities to provide a simple model accounting 
for much of the observed movement of population in space. The idea is 
that the number of people going a given distance s from a point is not a 
function of distance directly but rather a function of the spatial 
distribution of opportunities. More specifically, it was postulated that 
the number of people going s distance from a point is directly 
proportional to the number of opportunities on the perimeter of a circle 
with radius s and inversely proportional to the number of opportunities 
on or within that circle. After making operational definitions of 
“opportunities,” it was possible to demonstrate empirically that the 
model provided a quite promising description of the actual residential 
mobility between census tracts in Cleveland. Subsequent studies by 
Bright and Thomas [2], Isbell [8], Strodtbeck [15, 16], and others have 
applied the model to other populations, in America and abroad, with 
considerable success. Recently an interesting use of the concept of 
intervening opportunities in the geographical interpretation of commodity 
flow has been proposed by Ullman [17]. 


The writer pointed out in his original paper that the model as 
presented was inadequate in handling marked directional drifts where 
the uneven distribution of opportunities within the circle might facil- 
itate greater movement in one direction from the starting point than in 
an opposite direction. Nor are models involving distance, alone, of any 
help in such a case. Equations such as proposed by Zipf [18], Stewart 
[10, 11, 12], Dodd [3] and others take no account of the distribution 
of intervening populations. A discussion by Anderson [1], and Ikle [6], 
which appeared while the present paper was in preparation, emphasized 
the need for further study. 


Consider migration from St. Louis to New York, Denver, and Los 
Angeles, respectively. Between St. Louis and New York are most of the 
large population centers of the country. Between St. Louis and Denver 
or between St. Louis and Los Angeles are few population centers. We 
seek a model which will take that fact into account. The concept of 
intervening opportunities, in the form originally presented by the 
writer, does not. New York is only slightly farther from St. Louis 
than Denver. Hence a circle with center at St. Louis will show almost 
as many intervening opportunities between St. Louis and Denver as 
between St. Louis and New York. And it will show even more between St. 
Louis and Los Angeles. 


2. A REDEFINITION OF INTERVENING OPPORTUNITIES 


We first propose a redefinition of intervening opportunities as 
follows: 


1) Connect any two cities with a straight line. 
2) Draw a circle with this line as a diameter. 


3) Count the opportunities on or within this circle. This circle, 
shown as Circle B in Figure 1, can be contrasted with Circle A which has 
St. Louis as its center and the distance St. Louis-Denver as its radius. 


The construction of B as a circle is arbitrary; it is selected for 
its simplicity. An ellipse might turn out to be more appropriate. Or 
even a pie-shaped wedge with apex at St. Louis and angle whose optimum 
magnitude could only be determined empirically. 
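
The three construction steps can be sketched in code. This is only an illustration, not the writer's actual procedure: it assumes cities are given as planar map coordinates in miles, and the names `in_circle_b`, `opportunities_in_circle_b`, and the `pad` parameter (standing in for an optional extension of the diameter beyond each endpoint) are hypothetical. The toy coordinates at the bottom are invented.

```python
from math import hypot

def in_circle_b(p, a, b, pad=0.0):
    """True if point p lies on or within the circle whose diameter is the
    segment a-b, with the diameter extended by `pad` beyond each endpoint."""
    cx, cy = (a[0] + b[0]) / 2.0, (a[1] + b[1]) / 2.0      # midpoint = center
    radius = hypot(b[0] - a[0], b[1] - a[1]) / 2.0 + pad   # half the diameter
    return hypot(p[0] - cx, p[1] - cy) <= radius + 1e-9    # small tolerance

def opportunities_in_circle_b(cities, a, b, pad=0.0):
    """Sum the opportunities of all cities on or within Circle B.
    `cities` is a list of ((x, y), opportunities) pairs."""
    return sum(opp for xy, opp in cities if in_circle_b(xy, a, b, pad))

# Toy layout (invented): two endpoint cities 10 units apart, three others.
toy_cities = [((5, 0), 10), ((5, 5), 20), ((0, 6), 40)]
toy_total = opportunities_in_circle_b(toy_cities, (0, 0), (10, 0))
```

An ellipse or a wedge, as mentioned above, would only change the membership test in `in_circle_b`; the counting step stays the same.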

















FIGURE 1 





¹In constructing circles such as B for any two cities for the present 
paper, the writer arbitrarily extended all diameters to a point 
approximately 75 miles on the map beyond each of the two cities, to allow 
for the influence of large population centers which otherwise might be 
just outside the range and which it seemed imprudent to ignore. 


Now let us examine some empirical data. From the 1940 Census of 
Population, in the volume, INTERNAL MIGRATION 1935 to 1940, COLOR 
AND SEX OF MIGRANTS, Table 16, we find the reported number of 
migrants from each city of 100,000 or over to each other such city. From 
St. Louis to Los Angeles, Denver, and New York² the figures are: 

Los Angeles ............ 3,945 
Denver .................   462 
New York ............... 1,279 


The total migrants from St. Louis to all cities of 100,000 or over are 
24.0 thousand. 


We define opportunities in a given city as the total number of migrants 
to that city from all other cities of 100,000 or over, except from that 
city’s suburban satellites.³ These data are totalled in the census 
volume above referred to. From the same source we have also the data on 
the number of migrants leaving a given city for all other cities of 
100,000 population or over (less those going to the city’s suburban 
satellites). 


We now draw three Circles A around St. Louis as a center, with 
radii extending to Los Angeles, Denver, and New York respectively, and 
count the total number of opportunities in all cities lying approximately 
on or within each circle. Likewise, we draw three Circles B as described 
above and count the total numbers of opportunities therein. We obtain 
the following data in Table 1: 

TABLE 1 

                 Total         X_D           X_A             X_B             Y 
               migrants      Distance     Intervening     Intervening     Migrants 
                  to           from      Opportunities   Opportunities      from 
              given city    St. Louis⁴    (Circles A)     (Circles B)    St. Louis to 
              (thousands)    (miles)      (thousands)     (thousands)     given city 

Los Angeles      139.4         1901           743             288            3945 
Denver            11.6          878           456              58             462 
New York          83.7          965           547             444            1279 




















²All cities of 100,000 or over lying within the metropolitan area of a 
major center are treated as part of the central city in this paper. 
Moreover, cities like Minneapolis-St. Paul, or San Francisco-Oakland are 
combined and treated as one. 


³At first glance, our definition of opportunities may seem to involve 
“circularity.” It does not. There is even less “circularity” in its use 
in the present paper than there would be in the use of marginals in a 
contingency table in determining association among the individual 
internal cells of the table. 


⁴Highway driving distances, as given in RAND McNALLY REFERENCE AND 
ROAD ATLAS, 1953. 


If we let X_M equal the product of all migrants from St. Louis and 
of all migrants to a given city we have for X_M, in thousands: 


Los Angeles⁵    3,345.6 
Denver            278.4 
New York        2,008.8 


Consider now the intervening opportunities model as originally 
stated, Stouffer [13]: 


    Y = a X_M / X_A^b, where b = 1                    (1) 


Let us allow b to be determined empirically. In logarithms we can write 

    log Y - log X_M = log a - b log X_A               (2) 

which is a straight line with a slope of -b. 


Our new intervening opportunities model, substituting X_B for X_A, 
for which we make no a priori postulate about b, becomes the straight 
line 


    log Y - log X_M = log a - b log X_B               (3) 


The model based on distance irrespective of intervening opportunities 
can be written as a straight line: 


    log Y - log X_M = log a - b log X_D               (4) 

The logarithms⁶ are in Table 2. 


TABLE 2 

                  (log Y - log X_M)    log X_A    log X_B    log X_D 

To Los Angeles          1.08             2.87       2.46       3.28 
To Denver               1.22             2.66       1.76       2.94 
To New York              .80             2.74       2.65       2.98 
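
As a check on the arithmetic, the logarithms can be recomputed from the raw figures in Table 1. The sketch below is an illustration only; the dictionary layout and the names `table1`, `table2_row`, and `table2` are hypothetical. It assumes, as stated in the paper's footnotes, that the product of total out-migrants and total in-migrants is expressed in tens of thousands before taking logarithms. Values agree with the printed table to within rounding.

```python
from math import log10

TOTAL_OUT_ST_LOUIS = 24.0  # thousands of migrants leaving St. Louis

# Raw values from Table 1: total in-migrants (thousands), distance (miles),
# opportunities in Circles A and B (thousands), observed migrants Y.
table1 = {
    "Los Angeles": {"inflow": 139.4, "X_D": 1901, "X_A": 743, "X_B": 288, "Y": 3945},
    "Denver":      {"inflow": 11.6,  "X_D": 878,  "X_A": 456, "X_B": 58,  "Y": 462},
    "New York":    {"inflow": 83.7,  "X_D": 965,  "X_A": 547, "X_B": 444, "Y": 1279},
}

def table2_row(row):
    # Product of out- and in-migrant totals, rescaled to tens of thousands
    x_m = TOTAL_OUT_ST_LOUIS * row["inflow"] / 10.0
    return {
        "logY_minus_logXM": log10(row["Y"]) - log10(x_m),
        "log_X_A": log10(row["X_A"]),
        "log_X_B": log10(row["X_B"]),
        "log_X_D": log10(row["X_D"]),
    }

table2 = {city: table2_row(row) for city, row in table1.items()}

for city, vals in table2.items():
    print(city, {k: round(v, 2) for k, v in vals.items()})
```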




















⁵For example, 24.0 x 139.4 = 3345.6. Clearly, both factors must be taken 
into account. The product, instead of some other relationship, is used 
because of simplicity, especially in view of subsequent operations. 


⁶To keep these values positive, for convenience, we express X_M in terms 
of tens of thousands. Thus for Los Angeles log X_M = log 334.56 = 2.52; 
log Y = 3.60; log Y - log X_M = 1.08. 








FIGURE 2: Comparison of Three Models of Migration From and To St. Louis. 
Top row, migration from St. Louis: (a) intervening opportunities model 
(A), plotted against log X_A; (b) distance model, plotted against 
log X_D; (c) intervening opportunities model (B), plotted against 
log X_B. Bottom row, migration to St. Louis: (d), (e), (f), in the same 
order. LA = Los Angeles, D = Denver, NY = New York. 






The data are plotted on the top row of Figure 2. Chart 2(a) is 
based on the concept of intervening opportunities as originally pre- 
sented and the picture is hopelessly bad, since New York falls far 
short of what would be expected if Denver and Los Angeles fit the model. 
Chart 2(b) is based on the distance model. It is no better. On the 
other hand, Chart 2(c), based on the new intervening opportunities idea, 
alone of the three charts puts the points in correct rank order, 
although not in a perfect straight line. 


Now a model has value only if it has some generality. Let us see 
how the three concepts order the data when the reverse migration 
movements, to St. Louis, are considered. Here the ecology is very 
different. Our Circles A are now drawn around Los Angeles, Denver, and 
New York respectively with circumferences passing through St. Louis. 
There is sparse population within the Los Angeles and Denver circles as 
compared with the New York circle. The population within Circles B will 
differ only slightly from that in the previous Circles B.⁷ 


The observed data (Table 3) are as follows: 

TABLE 3 

                 Total         X_D           X_A             X_B             Y 
               migrants      Distance     Intervening     Intervening    Migrants to 
                 from           to       Opportunities   Opportunities    St. Louis 
              given city    St. Louis     (Circle A)      (Circle B)        from 
              (thousands)    (miles)      (thousands)     (thousands)     given city 

Los Angeles       47.8         1901           229             160             452 
Denver            14.7          878            99              59             197 
New York         106.7          965           405             372             702 




















Total migrants to St. Louis from all cities of 100,000 or over are 11.8 
thousand, whence values of X_M, the product of all migrants from Los 
Angeles times all migrants to St. Louis, etc., are: 


Los Angeles       564.0 
Denver            173.5 
New York        1,259.1 





⁷Actually, the only difference is illustrated by the procedure that the 
intervening population in the St. Louis to Los Angeles case includes 
within it the total number of migrants to Los Angeles, though, of course, 
excluding migrants from St. Louis itself, while the intervening 
population Los Angeles to St. Louis includes the total number of migrants 
to St. Louis, excluding migrants from Los Angeles. 


In Table 4 the logarithms are: 

TABLE 4 

                    (log Y - log X_M)    log X_A    log X_D    log X_B 

From Los Angeles           .91             2.36       3.28       2.20 
From Denver               1.05             2.00       2.94       1.77 
From New York              .75             2.61       2.98       2.57 





These data are plotted in the lower half of Figure 2. In this 
case, both of the intervening opportunities models order the data in 
correct rank order and also with rather satisfying linearity. The dis- 
tance model fails again. 
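
The rank-order claims for the six charts can be checked mechanically. The sketch below is an illustration under a simple reading of "correct rank order": since each model predicts that migration per unit of the migrant product falls as the explanatory variable rises, the city with the smallest explanatory value should show the largest ratio, and so on. The function names are hypothetical; the data are the raw values of Tables 1 and 3, which avoids any dependence on rounded logarithms.

```python
def inversely_ordered(ratio, x):
    """True if sorting cities by x ascending puts ratio in strictly
    descending order, i.e. the points are consistent with a negative slope."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    ranked = [ratio[i] for i in order]
    return all(a > b for a, b in zip(ranked, ranked[1:]))

# City order in every list: Los Angeles, Denver, New York.
from_st_louis = {
    "Y": [3945, 462, 1279],
    "X_M": [24.0 * 139.4, 24.0 * 11.6, 24.0 * 83.7],
    "X_A": [743, 456, 547], "X_D": [1901, 878, 965], "X_B": [288, 58, 444],
}
to_st_louis = {
    "Y": [452, 197, 702],
    "X_M": [11.8 * 47.8, 11.8 * 14.7, 11.8 * 106.7],
    "X_A": [229, 99, 405], "X_D": [1901, 878, 965], "X_B": [160, 59, 372],
}

def check(data):
    ratio = [y / xm for y, xm in zip(data["Y"], data["X_M"])]
    return {k: inversely_ordered(ratio, data[k]) for k in ("X_A", "X_D", "X_B")}

print("from St. Louis:", check(from_st_louis))
print("to St. Louis:  ", check(to_st_louis))
```

Under this reading, only the Circle B measure passes in both directions, which matches the discussion of the charts.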


At this point we see that only the new intervening opportunities 
model 2(f) has correctly ordered both sets of data, involving migra- 
tion both from and to St. Louis. But if the new model is to have the 
kind of generality we seek, these two sets of three points should not 
only each lie approximately on straight lines, but also should lie on 
the same straight line. 


In Figure 3(a) we superimpose on the same graph the data charted 
on Figures 2(c) and 2(f). Consistently, the migration from St. Louis 
is somewhat overestimated as compared to the migration to St. Louis. 


What is happening? What plausible modifications of our model 
might take it into account? 


3. INTRODUCING THE CONCEPT OF COMPETING MIGRANTS 


We seek some factor which is asymmetrical as between two cities; 
sheer distance and, to a large extent, our new concept of intervening 
opportunities are not. In the present instance, it should be a 
factor which makes opportunities in Los Angeles or Denver relatively 
more attractive to St. Louis migrants than the reverse. 


One such factor could be what we shall call competing migrants. 
Consider the map. Migrants from almost any city in the United States 
are closer to St. Louis initially than are migrants from Los Angeles 
to St. Louis. On the other hand, St. Louis emigrants are closer to Los 
Angeles than are the majority of big city emigrants in America. 


Is it not plausible to suggest that, everything else being equal, 
the attractiveness of City Y for migrants from City X will depend, at 
least to some extent, on how many potential migrants are closer to Y 
than are the potential migrants in X? 


FIGURE 3: Migration To and From St. Louis Superimposed on the Same 
Chart. 

FIGURE 4 


Let us define the number of competing migrants as X_C, the total 
number of persons leaving cities as close or closer to Y than the 
migrants in X. We draw two Circles C around Denver and St. Louis, 
respectively, in Figure 4, with radii St. Louis-Denver. From the same 
census source we find that the total number of emigrants from cities of 
100,000 or over within the circle centered at Denver is 160,000; the 
number of emigrants from cities within the circle centered at St. Louis 
is 512,000. Similarly, we compare two Circles C with radius St. Louis- 
Los Angeles and two Circles C with radius St. Louis-New York. We have 
in Table 5: 











TABLE 5 

                Thousands of Migrants      Thousands of Migrants 
                    Competing for              Competing for 
                     given city                  St. Louis 

Los Angeles              259                        782 
Denver                   160                        512 
New York                 452                        625 














In each case the competition is greater when movement is in the di- 
rection of St. Louis than when movement is in the reverse direction. 
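
The definition of competing migrants amounts to a filtered sum. The following sketch is illustrative only: the function name, the dictionary inputs, and the toy numbers are all hypothetical and are not census figures. It assumes the origin city's own emigrants are counted, since the origin is by definition "as close" to the destination as itself.

```python
def competing_migrants(distance_to_target, emigrants, origin):
    """Total emigrants from every city that is as close or closer to the
    target city than `origin` is (the origin itself included).

    distance_to_target: {city: distance to the target city}
    emigrants:          {city: total out-migrants from that city}
    """
    limit = distance_to_target[origin]
    return sum(emigrants[city]
               for city, d in distance_to_target.items()
               if d <= limit)

# Toy example (invented values): cities P and Q are within range of X, R is not.
toy_distance = {"X": 100, "P": 60, "Q": 100, "R": 150}
toy_emigrants = {"X": 5, "P": 7, "Q": 2, "R": 11}
toy_competing = competing_migrants(toy_distance, toy_emigrants, "X")
```

Because the limit depends on which city is taken as the destination, the measure is asymmetrical between a pair of cities, which is the property sought here.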


Let us now try a new model, involving both our revised concept of 
intervening opportunities X_B and our new concept of competing migrants 
which we will call X_C, and for simplicity we treat X_B and X_C as 
multiplicative: 

    Y = K X_M / (X_B X_C)                             (5) 

This, if a good fit, should order all six points (log Y - log X_M) on 
a negatively sloped linear function of (log X_B + log X_C). We have in 
Table 6: 














TABLE 6 

                     (log Y - log X_M)   log X_B   log X_C   (log X_B + log X_C) 

From St. Louis 
  To Los Angeles           1.08            2.46      2.41            4.87 
  To Denver                1.22            1.76      2.20            3.96 
  To New York               .80            2.65      2.66            5.31 
To St. Louis 
  From Los Angeles          .91            2.20      2.89            5.09 
  From Denver              1.05            1.77      2.71            4.48 
  From New York             .75            2.57      2.80            5.37 

















The data are plotted in Figure 3(b). There can be little doubt that 
the fit is improved and most, though not quite all, of the systematic 
discrepancy in the upper graph is now removed. 
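
One way to put a number on the visual impression is to fit a single straight line through all six points. The sketch below is an added illustration, not a computation reported in the paper: it recomputes the coordinates from the raw values in Tables 1, 3, and 5 and applies an ordinary simple regression. The name `slope_and_r` and the data layout are hypothetical.

```python
from math import log10, sqrt

# (Y, migrant product in thousands, Circle B opportunities, competing migrants)
points = {
    "To Los Angeles":   (3945, 24.0 * 139.4, 288, 259),
    "To Denver":        (462,  24.0 * 11.6,  58,  160),
    "To New York":      (1279, 24.0 * 83.7,  444, 452),
    "From Los Angeles": (452,  11.8 * 47.8,  160, 782),
    "From Denver":      (197,  11.8 * 14.7,  59,  512),
    "From New York":    (702,  11.8 * 106.7, 372, 625),
}

# Horizontal axis: log opportunities + log competing migrants.
xs = [log10(xb) + log10(xc) for (_, _, xb, xc) in points.values()]
# Vertical axis: log Y minus log of the product (in tens of thousands).
ys = [log10(y) - log10(xm / 10.0) for (y, xm, _, _) in points.values()]

def slope_and_r(xs, ys):
    """Least-squares slope and Pearson correlation for paired values."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sxx, sxy / sqrt(sxx * syy)

slope, r = slope_and_r(xs, ys)
print(f"slope = {slope:.3f}, r = {r:.3f}")
```

A strongly negative correlation across both directions of movement is what the combined model requires; six points are of course too few to establish more than that.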


Because of its simplicity this model seems preferable, at least 
initially, to one which logically might be even more attractive in that 
it would attempt directly to weight each intervening city’s contribution 
to a cumulated intervening opportunity total by a figure proportional 
to that city’s competing migrants (or some power of this figure). At 
present this operation would seem prohibitively laborious, but it might 
become feasible if programmed for one of the big computers. 


4. A WIDER TEST OF THE MODEL INVOLVING BOTH INTERVENING 
OPPORTUNITIES AND COMPETING MIGRANTS 


The construct introduced here could be effective, in the illustra- 
tion given, merely because it accidentally corresponded to some other 
factors peculiar to St. Louis in relation to the other cities. If the 
model is to have generality, it must hold for other cities as well, and 
both X_B and X_C must be needed to do the job. 


Therefore, we shall now study the migration to and from Los Angeles, 
Denver, Chicago, and New York from each of the 16 American cities with 
more than 500,000 population in 1940.⁸ This yields 116 inter-city 
observations in all. The basic data, in logarithms, are shown in the 
first five columns of Table 12 (see Appendix). 





⁸We shall continue to treat Minneapolis-St. Paul, San Francisco-Oakland, 
and Kansas City, Mo.-Kansas City, Kansas as single cities. All cities 
of 100,000 or over in the metropolitan area of a large central city (e.g. 
Camden, N. J., in relation to Philadelphia) are treated as part of the 
central city. 


We now fit to these data by least squares the following equation: 


    log Y = log K + A log X_M + B log X_B + C log X_C         (6) 


We introduce the separate coefficients B and C in order to compare their 
relative magnitudes, and we introduce the coefficient A to allow for 
the possibility that sheer numbers of immigrants to and emigrants from 
a given city may have an exponential rather than linear effect on the 
movement between that city and another. 


For purposes of comparison, we separately test the postulates of 
Zipf, Stewart, and others, involving sheer distance. For this test 
we also give X_M the freedom to vary exponentially and leave the 
coefficient of X_D, distance, to be determined empirically. We have: 

    log Y = log K’ + A’ log X_M + D log X_D                   (7) 


By multiple regression analysis, the following least squares constants 
(Table 7) are determined: 

TABLE 7 

    Intervening Opportunities 
    and Competing Migrants model          Distance model 

    log K =  2.5237                       log K’ =  2.3902 
        A =  1.2509                           A’ =  1.2047 
        B =  -.4195                            D =  -.6157 
        C =  -.4238 
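
For readers who wish to repeat such a fit, a log-linear model with an intercept can be estimated by solving the normal equations. The sketch below is a generic illustration in plain Python, not the computing procedure used for the paper. The function name is hypothetical, and because the underlying observations are not reproduced here, the check uses a handful of invented predictor rows generated from the printed constants.

```python
def fit_least_squares(predictors, targets):
    """Ordinary least squares with an intercept, via the normal equations.

    predictors: list of rows of predictor values (e.g. logs of the three
                explanatory variables); targets: list of observed log Y.
    Returns [intercept, coef_1, coef_2, ...].
    """
    X = [[1.0] + [float(v) for v in row] for row in predictors]
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    aug = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * t for x, t in zip(X, targets))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        for r in range(k):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] for i in range(k)]

# Synthetic check only: predictor rows invented for illustration.
synthetic_rows = [(1, 2, 3), (2, 1, 4), (3, 5, 1), (4, 2, 2), (2, 3, 5)]
generating = (2.5237, 1.2509, -0.4195, -0.4238)  # printed constants as "truth"
synthetic_log_y = [
    generating[0] + sum(c * v for c, v in zip(generating[1:], row))
    for row in synthetic_rows
]
log_K_hat, A_hat, B_hat, C_hat = fit_least_squares(synthetic_rows, synthetic_log_y)
```

With exact synthetic targets the routine simply recovers the generating constants; applied to real observations it would return the least squares estimates.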














The values of log Y’_MBC and of log Y’_MD predicted from these two 
equations, respectively, are shown in the last two columns of Table 12. 


It is interesting to note that the coefficients B and C are nearly 
alike: -.4195 and -.4238, respectively. 


Also it is interesting to note that, in these cities, both A and 
A’ are somewhat greater than unity, being 1.2509 and 1.2047 for the two 
models, respectively. This says that the attractive power of one city 
for another, either when intervening opportunities and competing 
migrants are held constant, or when distance is held constant, is best 
measured by raising X_M, the product of total in-migrants and total 
out-migrants, to the 1.2509 or 1.2047 power, respectively. This means 
that great centers like New York, Chicago, or Los Angeles draw from or 
export to other large centers exponentially more people than to or from 
smaller centers, or than do two small centers to or from each other. 
Analogous findings should be watched for in all further studies of 
inter-city migration. 


⁹Results using the original intervening opportunities model A are omitted 
in the interest of brevity. Although fitting better, on the average, than 
the distance model, it possesses many of the same systematic errors. 


The multiple correlation coefficients (R) and standard errors of 
estimate (S) are as follows in Table 8: 

TABLE 8 

         Intervening Opportunities 
         and Competing Migrants model       Distance model 

    R              .9761                        .9332 
    S              .1414                        .2299 

















While multiple correlations as high as these are unusual in socio- 
logical studies and gratifying, one must keep in mind the fact that 
there is a very wide range in the values of log Y. This can be seen 
by the fact that observed migrants varied all the way from 29 from 
Denver to Buffalo to 18,942 from New York to Los Angeles. The pre- 
dicted value of Y could correlate very highly with the observed values 
and still differ by a substantial percentage error. Hence, in comparing 
the two models, the respective values of the standard errors are 
more informative in some respects than the correlation coefficients. 
As shown above, S, the standard error of estimate from the regression 
plane, is much larger for the distance model. The significance of this 
can be better appreciated if we look in Table 9 at S in terms of its 
antilogarithms, which represent the ratio of the observed to expected Y: 











TABLE 9 

                                 Intervening Opportunities 
                                 and Competing Migrants model    Distance model 

Upper band of standard error               1.38                      1.70 
Lower band of standard error                .72                       .59 
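
The conversion from a standard error in logarithms to a ratio band is a one-line calculation: since the regressions are in common logarithms, one standard error above or below the fitted plane corresponds to multiplying or dividing the expected Y by 10 raised to the power S. The short sketch below reproduces the band from the two printed standard errors; the function name is hypothetical.

```python
def error_band(s):
    """Return (upper, lower) ratios of observed to expected Y that lie one
    standard error S (in log10 units) above and below the fitted value."""
    return 10 ** s, 10 ** (-s)

bands = {
    "intervening opportunities and competing migrants": error_band(0.1414),
    "distance": error_band(0.2299),
}

for model, (upper, lower) in bands.items():
    print(f"{model}: upper {upper:.2f}, lower {lower:.2f}")
```

Note that the band is asymmetrical in percentage terms: an upper ratio of 1.38 is 38 per cent above the prediction, while the matching lower ratio of .72 is 28 per cent below it.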











About a third of the time the observed migration is more than 38 
per cent higher or more than 28 per cent lower than predicted from our 
intervening opportunities and competing migrants model. Unless these 
errors are systematic (within one type of region, for example, or for 
movement in one general direction), this means that there are special 
local factors operating not covered in the theory. This we should 
expect. For example, the largest percentage error for any two cities was 
in the case of Washington, D. C., to Denver. One would not expect the 
movement to and from the nation’s capital to follow closely either 
general pattern, except by accident. We observe 251 migrants from 
Washington to Denver but predict only 93. The discrepancy, which could 
be due to the transfer of an entire government office to Denver, is 
numerically small but very large in percentage. 


The distance model, on the average, is far less satisfactory. In 
about a third of the cases the error exceeds 70 per cent, on the 
positive side, or 41 per cent, on the negative side. 


Of course, much of the observed correlation in the case of each 
model is attributable to X_M. Let us, therefore, hold X_M constant and 
look at the residual correlations (see Table 10). 











TABLE 10 

Intervening Opportunities and Competing Migrants model 

    Partial correlation of X_B with Y, 
    holding X_M and X_C constant                        -.59 

    Partial correlation of X_C with Y, 
    holding X_M and X_B constant                        -.57 

    Combined partial correlation of X_B and X_C 
    with Y, holding X_M constant¹⁰                       .91 

Distance model 

    Partial correlation of X_D with Y, 
    holding X_M constant                                -.72 





These findings are interesting for two reasons: (1) They show 
that, for the 116 inter-city migrations here studied, the impact of 
competing migrants is approximately as effective as the impact of 
intervening opportunities. The partial correlations of -.57 
and -.59 respectively are each significant far beyond the .01 level. 
And this complementarity occurs in spite of the fact that the overall 
zero-order correlation between the measures of intervening 
opportunities and competing migrants is fairly high, namely +.74. 
(2) When X_M is held constant, the combined partial correlation of the 
two measures with the residuals of Y is a quite satisfactory .91, a 
measure which is substantially better than the comparable measure of 
.72 for distance alone. It can be shown further that the standard error 
of estimate of [log Y - f_Y (log X_M)] from [log X_D - f_D (log X_M)] 
is about two-thirds larger than the comparable standard error from the 
model using intervening opportunities and competing migrants. 


¹⁰This measure may be unfamiliar to some readers. It is obtained by the 
formula 

    R_{Y.X_B X_C.X_M} = sqrt( 1 - (1 - r^2_{Y X_B.X_M}) (1 - r^2_{Y X_C.X_B X_M}) ) 

which was introduced by the writer in 1936, Stouffer [14]. As a multiple 
correlation, it cannot, of course, be assigned a minus sign. 
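
The combined partial correlation described in the footnote can be computed directly from two ordinary partial correlations: that of the first measure with Y holding the control constant, and that of the second measure with Y holding both the control and the first measure constant. The sketch below illustrates the arithmetic only; the function name is hypothetical and the input values are arbitrary, not figures from this study.

```python
def combined_partial(r_first, r_second):
    """Multiple-partial correlation of two predictors with Y.

    r_first:  partial correlation of predictor 1 with Y, control held constant
    r_second: partial correlation of predictor 2 with Y, control and
              predictor 1 held constant
    The result is non-negative by construction, whatever the input signs.
    """
    return (1.0 - (1.0 - r_first ** 2) * (1.0 - r_second ** 2)) ** 0.5

# Arbitrary illustrative inputs:
example = combined_partial(-0.6, -0.8)
print(round(example, 4))
```

The product form shows why the measure can be high even when each partial is moderate: each predictor removes a share of the variance left unexplained by the other.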


Comparison of any two models should not rest alone on the overall 
correlational comparisons. It is actually possible for one model to 
yield a higher correlation with observed data than another, on the 
average, yet still be inferior. 


Why? Because the errors in the one model could be highly 
systematic, while the errors in the other, though greater on the average, 
could be more or less at random. In developing theory, we are seeking 
maximum generality, and a model is unsatisfactory if it contains large 
systematic errors, that is, if it consistently overpredicts, or 
consistently underpredicts, for a substantial block of cases which are 
similar in their characteristics: for example, inland cities as a 
whole, or cities on either seaboard as a whole, or one direction 
of migration as compared with another. 


Let us now examine the two models for evidence of such systematic 
errors. Table 13, giving the anti-logs of the last three columns in 
Table 12 (see Appendix for Tables 12 and 13), presents the observed and 
predicted values for all observations, arranged to enable easy obser- 
vation of systematic error if present. 


Study of Table 13 makes it very clear that the type of problem 
illustrated with the special case of St. Louis in the first part of 
this paper is no isolated phenomenon. The intervening opportunities 
and competing migrants concepts, taken together, do tend to erase large 
systematic discrepancies in the distance model, many of which are more 
drastic than the St. Louis illustration. Both models are sadly 
inadequate for some special cases, but our new measures order the data 
much better than the distance model for migration to and from Los 
Angeles, Denver, and Chicago, and at least as well for migration to and 
from New York. 


One way of summarizing Table 13 is to compare the relative 
performance of the two models in predicting migration: 

TABLE 11 

                               Intervening 
                              Opportunities 
                              and Competing               Distance 
                             Migrants Better     Same      Better     Total 

To Los Angeles                     14              2          0         16 
From Los Angeles                   11              -          5         16 
To Denver                          15              -          1         16 
From Denver                        13              -          3         16 
To Chicago                         13              -          3         16 
From Chicago                       13              -          3         16 
To New York                         9              -          7         16 
From New York                      11              2          3         16 
Total, less duplications¹¹         89              4         23        116 

















The direction of systematic bias in the distance model and the 
extent to which it has been corrected can be seen graphically in Figures 
5 and 6, where totals for cities grouped regionally are charted. For 
example: 


The migration to Los Angeles, Denver, and Chicago from 
cities west of Chicago is systematically underpredicted by the 
distance model. The same is true of the reverse migration. 


The migration from inland cities to New York is system- 
atically overpredicted by the distance model, as is the migra- 
tion from New York to these same cities. 


In almost all such cases graphed in Figures 5 and 6, the utiliza- 
tion of measures of intervening opportunities and competing migrants 
narrows the gap, with the resulting errors now in one direction now in 
another. 


Both models are fairly free of systematic error for coast-to-coast 
movement, except in the case of movement involving New York City. There 
is a large excess of migrants from New York City to Los Angeles and San 
Francisco, reminiscent of the amusing “New Yorker’s map of New York” on 
which Hollywood is a western suburb. Likewise there is some excess of 
reverse migration from the West Coast to New York. 





¹¹For the reader’s convenience, data of the type Los Angeles-Denver 
appear twice in Table 12 and Table 13 (under both Los Angeles and 
Denver). In computing the equations, of course, such data counted only 
as one observation. 


FIGURE 5: In-Migrants, Observed and Predicted, To Los Angeles, Denver, 
Chicago and New York. 

FIGURE 6: Out-Migrants, Observed and Predicted, From Los Angeles, 
Denver, Chicago and New York. 

Legend for Figures 5 and 6: observed migrants; predicted migrants from 
intervening opportunities and competing migrants model; predicted 
migrants from distance model. 


There are a few other special cases of discrepancies. Both models 
underestimate the short-distance migration to New York from Baltimore, 
Washington, Philadelphia and Boston and overestimate the migration in 
the reverse direction. Both models (though the distance model is much 
worse) underestimate the migrants from Los Angeles to San Francisco and 
Denver. Reference to Table 13 will show that Washington, D. C. behaves 
erratically with respect to Denver and New York City. 


Nobody who contemplates the multiplicity of economic, political, 
social, and psychological factors which must enter into the personal 
contemplation of any prospective migrant would expect any simple model, 
using only two or three variables, to account for everything. The fact 
that the two concepts of intervening opportunities and competing migrants 
order so much of the phenomena so well may indeed be surprising to some 
readers, especially to psychologists who may be predisposed to look first 
at an individual’s motives and only secondarily, if at all, at the mas- 
sive framework of ecological structure. 


5. CONCLUSION 


It is the writer’s hope that this present study, like his previous 
one in which the concept of intervening opportunities was introduced, 
will stimulate further research. It may very well turn out that the 
notions of intervening opportunities or competing migrants as here de- 
veloped, are imperfect reflections of some other more effective concepts 
yet to be discovered. Moreover, the writer would be the last to suggest 
that the measurements used in this paper are the best possible. They 
are crude and arbitrary. Improvements in measurement may improve par- 
ticular predictions. 


Especially, it should be noted that neither the distance model nor 
the model proposed in this paper requires necessarily a measurement of 
distance in terms of simple miles. A more sophisticated approach, in 
either case, might be to measure distance not in miles but in terms of 
“economic distance,” based on transport costs. This is a challenging 
problem for future students. See discussion in papers by Harris [5], 
Dunn [4], and Nelson [9], and in Chapter 11 in the forthcoming book by 
Isard et al. [7]. 


In summary, this analysis contributes to migration theory by dem- 
onstrating the shortcomings of a mere mechanical use of physical dis- 
tance and by demonstrating rather dramatically the advantages of a 
better model in the reduction both of average and systematic error. 
Whether or not the concepts of intervening opportunities and competing 
migrants long remain in their present formulation, they should stimulate 
theoretical and empirical analysis which will lead to further improvement 
in the theory of migration. And they should, of course, stimulate 
the exploration, testing, and modification of still broader theoretical 
conceptions of which those related to geographical mobility may be only 
a special case, such, for example, as are involved in Zipf's "Principle 
of Least Effort," Stewart's "Demographic Gravitation," or Dodd's "Theory 
of Dimensional Analysis." 


¹² The intervening opportunities and competing migrants concepts are 
especially cumbersome in application to two major centers with few or no 
cities intervening. For example, in defining intervening opportunities 
to include those in the target city as well as those in between, we 
avoid the absurdity of possibly predicting an infinite number of migrants; 
but whatever result we get may be due in part to a variety of partly 
compensating errors. Similarly, in defining competing migrants to include 
those in the city of emigration. 


APPENDIX 


TABLE 12: Data Used in Predicting Intercity Migration; Comparison of 
Observed Migration (log Y) with that Predicted by Intervening Opportunities 
and Competing Migrants (log Y_MBC) and that Predicted by Distance (log Y_MD) 

City log X_M log X_B log X_C log X_D log Y log Y_MBC log Y_MD 
TO LOS ANGELES 
from: 
San Francisco 2.60 2.14 1.67 2.60 4.08 4.17 3.92 
Denver 2.31 2.23 1.99 3.08 3.70 3.63 3.28 
Kansas City 2.58 2.39 [illeg.] 3.21 3.86 [illeg.] 3.52 
Minn.-St. Paul 2.48 2.40 2.46 3.30 [illeg.] 3.58 3.20 
St. Louis 2.52 2.46 2.41 3.28 3.60 3.62 3.41 
Chicago 3.05 2.53 2.56 3.33 4.21 4.19 4.01 
Milwaukee 2.20 2.38 2.62 3.34 3.15 3.09 2.98 
Detroit 2.61 2.68 2.03 3.38 3.63 3.52 3.45 
Cleveland 2.58 [illeg.] 2.74 3.39 3.47 3.45 3.41 
Pittsburgh 2.3? 2.74 [illeg.] 3.41 3.19 3.15 [illeg.] 
Buffalo 2.20 [illeg.] 2.79 3.42 2.94 2.95 2.93 
Baltimore 2.30 2.83 2.81 3.44 2.80 3.02 3.04 
Washington 2.48 2.82 2.80 3.44 3.20 3.26 3.26 
Philadelphia 2.62 2.88 2.87 3.46 3.29 3.38 3.42 
Boston 2.48 2.92 2.93 3.49 [illeg.] 3.16 3.23 
New York 3.18 2.84 2.89 3.46 4.28 4.09 4.09 
FROM LOS ANGELES 
to: 
San Francisco 2.22 1.54 1.84 2.60 4.07 3.87 3.46 
Denver 1.74 1.60 2.56 3.08 3.03 2.94 2.59 
Kansas City 1.85 2.09 2.90 3.21 2.87 [illeg.] 2.64 
Minn.-St. Paul 1.84 2.09 2.92 3.30 2.80 2.41 2.58 
St. Louis 1.75 2.20 2.89 3.28 2.66 2.57 2.48 
Chicago 2.41 2.42 2.89 3.33 3.39 3.30 3.24 
Milwaukee 1.60 2.39 2.92 3.34 [illeg.] 2.29 2.26 
Detroit 2.27 2.58 2.9? 3.38 3.00 3.05 3.04 
Cleveland 1.86 2.60 2.90 3.39 2.50 [illeg.] 2.54 
Pittsburgh 1.57 2.62 2.91 3.41 [illeg.] 2.16 2.18 
Buffalo 1.65 2.60 2.91 3.42 2.10 2.26 [illeg.] 
Baltimore 1.91 2.74 2.90 3.44 2.36 2.20 2.50 
Washington [illeg.] [illeg.] 2.90 3.44 3.07 3.05 3.07 
Philadelphia 1.99 2.80 2.90 3.46 2.81 2.61 2.66 
Boston 1.81 2.85 2.90 3.49 [illeg.] 2.36 2.42 
New York 2.60 2.81 2.88 3.46 3.60 3.38 3.39 
TABLE 12 (continued) 
City log X_M log X_B log X_C log X_D log Y log Y_MBC log Y_MD 
TO DENVER 
from: 
San Francisco 1.53 2.25 2.63 3.11 2.59 2.38 2.32 
Los Angeles 1.74 1.60 2.56 3.08 3.03 2.94 2.59 
Kansas City 1.51 1.56 1.68 2.79 3.13 3.05 2.49 
Minn.-St. Paul 1.40 1.60 2.11 2.93 2.56 2.73 2.27 
St. Louis 1.44 1.76 1.20 2.94 2.66 2.65 2.31 
Chicago 1.96 1.99 2.33 3.01 3.09 3.15 2.90 
Milwaukee 1.11 2.15 2.37 3.02 2.10 2.01 1.87 
Detroit 1.53 2.35 2.70 3.12 2.33 2.31 2.31 
Cleveland 1.51 2.49 2.74 3.14 2.10 2.21 2.28 
Pittsburgh 1.28 2.53 2.78 3.15 1.94 1.89 1.99 
Buffalo 1.11 2.54 2.80 3.19 1.67 1.66 1.76 
Baltimore 1.2? 2.68 2.82 3.21 1.64 1.74 1.90 
Washington 1.40 2.63 2.83 3.23 2.40 1.97 2.09 
Philadelphia 1.54 2.75 2.85 3.24 2.06 2.09 2.25 
Boston 1.40 2.78 2.95 3.28 1.92 1.86 2.04 
New York 2.08 2.69 2.90 3.25 2.84 2.77 2.90 
FROM DENVER 
to: 
San Francisco 1.71 2.31 2.10 3.11 3.18 2.80 2.54 
Los Angeles 2.31 2.23 1.99 3.08 3.70 3.63 3.28 
Kansas City 1.32 1.59 2.45 2.79 2.76 2.47 2.26 
Minn.-St. Paul 1.32 1.63 2.55 2.93 2.26 2.41 2.18 
St. Louis 1.24 1.77 2.71 2.94 2.29 2.18 2.07 
Chicago 1.89 2.15 2.81 3.01 2.85 2.80 2.81 
Milwaukee 1.08 2.14 2.81 3.02 1.72 1.79 1.83 
Detroit 1.76 2.40 2.85 3.12 2.41 2.51 2.59 
Cleveland 1.34 2.50 2.86 3.14 1.79 1.94 2.07 
Pittsburgh 1.04 2.52 2.87 3.35 1.56 1.55 1.70 
Buffalo 1.15 2.53 2.86 3.19 1.46 1.69 1.81 
Baltimore 1.40 2.69 2.87 3.21 1.73 1.93 2.10 
Washington 1.81 2.66 2.86 3.23 2.66 2.46 2.58 
Philadelphia 1.48 2.76 2.86 3.24 1.74 2.01 2.18 
Boston 1.30 2.78 2.86 3.31 1.78 1.77 1.92 
New York 2.09 2.72 2.81 3.25 2.62 2.81 2.91 
TABLE 12 (continued) 
City log X_M log X_B log X_C log X_D log Y log Y_MBC log Y_MD 
TO CHICAGO 
from: 

San Francisco 2.20 2.60 2.90 3.34 2.80 2.96 2.98 
Los Angeles 2.41 2.42 2.89 3.33 3.39 3.30 3.24 
Denver 1.89 2.15 2.81 3.01 2.85 2.80 2.81 
Kansas City 2.18 2.01 2.41 2.70 3.39 3.39 3.35 
Minn.-St. Paul 2.08 1.84 2.29 2.61 3.49 3.38 3.29 
St. Louis [illeg.] 1.86 2.00 2.47 3.55 3.54 3.41 
Milwaukee [illeg.] 1.75 [illeg.] 1.95 3.50 3.48 3.32 
Detroit 2.20 1.97 1.85 2.43 3.56 3.67 3.54 
Cleveland 2.18 2.23 2.18 2.54 3.32 3.38 3.45 
Pittsburgh 1.94 2.29 2.36 2.67 3.09 2.99 3.08 
Buffalo 1.79 2.29 2.45 2.73 [illeg.] 2.76 2.87 
Baltimore 1.88 2.47 2.?2 2.84 2.57 [illeg.] 2.91 
Washington 2.08 2.40 2.54 2.84 2.95 3.04 3.15 
Philadelphia 2.20 2.58 2.62 2.88 2.99 3.08 [illeg.] 
Boston 2.08 2.64 2.80 3.00 2.88 2.83 3.05 
New York 2.76 2.53 2.70 2.92 3.83 3.77 3.92 

FROM CHICAGO 

to: 

San Francisco 2.44 2.58 2.56 3.34 3.47 3.41 3.27 
Los Angeles 3.05 2.53 2.56 3.33 4.21 4.19 4.01 
Denver 1.96 1.99 2.33 3.01 3.09 3.15 2.90 
Kansas City 2.06 1.80 2.24 2.70 3.30 3.40 3.23 
Minn.-St. Paul | 2.05 1.48 1.94 2.61 3.51 3.64 3.25 
St. Louis 1.97 1.48 2.00 2.47 3.35 3.52 3.24 
Milwaukee 1.82 1.04 1.93 1.95 3.50 3.55 3.38 
Detroit 2.49 1.91 2.20 2.43 3.74 3.90 3.89 
Cleveland 2.08 2.32 2.29 2.54 3.40 [illeg.] 3.34 
Pittsburgh 1.79 [illeg.] 2.48 2.67 2.77 2.80 2.90 
Buffalo 1.89 2.18 2.66 2.73 2.78 2.85 2.99 
Baltimore 2.23 2.41 [illeg.] 2.84 2.86 3.02 3.21 
Washington 2.54 2.39 2.73 2.84 3.38 3.54 3.70 
Philadelphia 2.20 2.55 2.66 2.88 2.95 3.08 3.27 
Boston 2.03 2.60 2.66 3.00 2.80 2.85 2.99 
New York 2.82 2.56 2.58 2.92 3.94 3.88 3.99 
TABLE 12 (continued) 
City log X_M log X_B log X_C log X_D log Y log Y_MBC log Y_MD 
TO NEW YORK 
from: 
San Francisco 2.38 2.91 2.89 3.49 3.26 3.06 3.11 
Los Angeles 2.60 2.81 2.88 3.46 3.60 3.38 3.39 
Denver 2.09 2.72 2.81 3.25 2.62 2.81 2.91 
Kansas City 2.36 2.68 2.72 3.08 2.92 3.20 3.34 
Minn.-St. Paul 2.26 2.66 [illeg.] 3.10 2.93 3.08 3.20 
St. Louis 2.30 2.65 2.66 2.98 3.11 3.16 3.33 
Chicago 2.82 2.56 2.58 2.92 3.94 3.88 3.99 
Milwaukee 1.97 2.60 2.63 2.96 2.67 2.78 2.94 
Detroit 2.38 2.43 2.45 2.80 3.37 3.44 3.53 
Cleveland 2.36 2.36 2.34 2.70 3.41 3.49 3.57 
Pittsburgh 2.15 2.32 2.23 2.57 3.52 3.29 3.40 
Buffalo 1.98 2.20 2.25 2.57 3.22 3.12 3.19 
Baltimore 2.08 2.19 1.92 2.28 3.48 3.39 3.49 
Washington 2.26 2.12 2.04 2.35 3.93 3.60 3.67 
Philadelphia 2.40 1.98 1.46 1.94 3.94 4.08 4.09 
Boston 2.26 2.07 2.09 2.36 3.81 3.60 3.66 
FROM NEW YORK 
to: 
San Francisco 2.57 2.89 2.88 3.49 3.59 3.30 3.34 
Los Angeles 3.18 2.84 2.89 3.46 4.28 4.09 4.09 
Denver 2.08 2.69 2.90 3.25 2.84 2.77 2.90 
Kansas City 2.19 2.62 2.80 3.08 2.83 2.98 3.13 
Minn.-St. Paul 2.19 2.59 2.78 3.10 2.92 3.00 3.12 
St. Louis 2.10 2.57 2.80 2.98 2.85 2.89 3.09 
Chicago 2.76 2.53 2.70 2.92 3.83 3.77 3.92 
Milwaukee 1.94 2.51 2.74 2.96 2.68 2.74 2.90 
Detroit 2.62 2.35 2.65 2.80 3.57 3.69 3.82 
Cleveland 2.20 2.20 2.62 2.70 3.23 3.24 3.38 
Pittsburgh 1.92 2.13 2.62 2.57 3.12 2.92 3.12 
Buffalo 2.00 1.92 2.42 2.57 3.32 3.19 3.22 
Baltimore 2.26 1.95 2.43 2.28 3.60 3.63 3.71 
Washington 2.67 1.97 2.12 2.35 4.00 4.14 4.16 
Philadelphia 2.33 1.52 1.89 1.94 3.82 4.00 4.00 
Boston 2.16 1.67 2.01 2.36 3.63 3.67 3.54 
TABLE 13: Migrants Observed and as Predicted from Two Models, with Subtotals for Groups of Cities 
(Columns in each half: Observed; Predicted from Intervening Opportunities and Competing Migrants; Predicted from Distance.) 

TO LOS ANGELES FROM | FROM LOS ANGELES TO 
San Francisco 11,667 14,479 8,318 | San Francisco 11,934 [illeg.] [illeg.] 
Denver 5,032 4,266 1,905 | Denver 1,083 871 389 
(subtotal) 16,699 18,745 10,223 | (subtotal) 13,017 [illeg.] 3,273 
Kansas City 7,319 5,888 3,311 | Kansas City 735 537 437 
Minn.-St. Paul 5,103 3,802 2,239 | Minn.-St. Paul 636 513 380 
St. Louis 3,945 4,169 2,570 | St. Louis 452 372 302 
(subtotal) 16,367 13,859 8,120 | (subtotal) 1,823 1,422 1,119 
Chicago 16,251 15,490 10,230 | Chicago 2,441 1,995 1,738 
Milwaukee 1,408 1,236 955 | Milwaukee 234 [illeg.] 182 
(subtotal) 17,659 16,720 11,185 | (subtotal) 2,675 2,191 1,920 
Detroit 4,261 3,311 2,81? | Detroit [illeg.] 1,122 1,09? 
Cleveland [illeg.] 2,818 2,57? | Cleveland 314 339 347 
Pittsburgh 544 1,413 1,349 | Pittsburgh 169 145 151 
Buffalo 872 891 [illeg.] | Buffalo 127 182 [illeg.] 
(subtotal) [illeg.] [illeg.] 7,58? | (subtotal) 1,599 1,7?? [illeg.] 
Baltimore 635 1,047 1,0?? | Baltimore [illeg.] 338 372 
Washington 1,582 1,82? 1,82? | Washington 1,17? [illeg.] 1,175 
Philadelphia 1,947 2,399 2,630 | Philadelphia 643 [illeg.] 457 
Boston 1,354 1,445 1,69? | Boston [illeg.] 229 [illeg.] 
(subtotal) 5,5?? 6,711 7,244 | (subtotal) [illeg.] 2,00? [illeg.] 
New York 18,942 12,300 12,300 | New York 3,982 2,399 2,454 
TO DENVER FROM | FROM DENVER TO 
San Francisco 391 240 204 | San Francisco 1,511 631 347 
Los Angeles 1,083 871 389 | Los Angeles 5,032 4,266 1,905 
(subtotal) 1,474 1,111 593 | (subtotal) 6,543 4,8?? 2,2?? 
Kansas City 1,365 1,122 309 | Kansas City 571 295 182 
Minn.-St. Paul 364 513 18? | Minn.-St. Paul [illeg.] 257 151 
St. Louis 462 447 204 | St. Louis 197 152 117 
(subtotal) 2,191 [illeg.] 499 | (subtotal) 948 704 450 
Chicago 1,218 1,413 794 | Chicago 714 631 647 
Milwaukee 127 102 74 | Milwaukee 53 62 68 
(subtotal) 1,345 1,515 [illeg.] | (subtotal) 767 69? 715 
Detroit 214 204 204 | Detroit 259 324 389 
Cleveland 125 162 191 | Cleveland 62 [illeg.] [illeg.] 
Pittsburgh [illeg.] [illeg.] [illeg.] | Pittsburgh [illeg.] [illeg.] [illeg.] 
Buffalo [illeg.] [illeg.] [illeg.] | Buffalo 29 49 65 
(subtotal) 473 489 551 | (subtotal) [illeg.] 495 622 
Baltimore 44 55 79 | Baltimore [illeg.] [illeg.] [illeg.] 
Washington 251 93 [illeg.] | Washington 453 288 382 
Philadelphia 114 123 178 | Philadelphia 55 125 151 
Boston 83 72 1?? | Boston 6? 59 83 
(subtotal) 492 343 490 | (subtotal) 622 557 74? 
New York 686 589 794 | New York 413 645 813 
TABLE 13 (continued) 
(Columns in each half: Observed; Predicted from Intervening Opportunities and Competing Migrants; Predicted from Distance.) 

TO CHICAGO FROM | FROM CHICAGO TO 
San Francisco 635 912 955 | San Francisco 2,928 2,570 1,862 
Los Angeles 2,441 1,995 1,738 | Los Angeles 16,251 15,490 10,230 
Denver 714 631 647 | Denver 1,218 1,413 794 
(subtotal) 3,790 3,538 3,340 | (subtotal) 20,397 19,473 12,886 
Kansas City 2,433 2,455 2,239 | Kansas City 2,011 2,512 1,622 
Minn.-St. Paul 3,058 2,399 1,950 | Minn.-St. Paul 3,245 4,365 1,778 
St. Louis 3,516 3,467 2,571 | St. Louis 2,231 3,311 1,738 
(subtotal) 9,007 8,321 6,760 | (subtotal) 7,487 10,188 5,138 
Milwaukee 3,158 3,02? ?,090 | Milwaukee 3,187 3,548 2,399 
Detroit 3,606 4,677 3,467 | Detroit 5,448 7,943 7,763 
Cleveland 2,083 2,399 2,818 | Cleveland 1,482 1,862 2,188 
Pittsburgh 1,221 977 1,202 | Pittsburgh 594 631 794 
Buffalo 535 576 741 | Buffalo 606 708 977 
(subtotal) 7,445 8,629 8,228 | (subtotal) 8,130 11,144 11,722 
Baltimore 368 589 813 | Baltimore 730 1,047 1,622 
Washington 882 1,09? 1,413 | Washington 2,383 3,467 5,012 
Philadelphia 985 1,202 1,862 | Philadelphia 895 1,202 1,862 
Boston 757 67? 1,122 | Boston 636 708 977 
(subtotal) 2,992 3,563 5,210 | (subtotal) 4,644 6,424 9,473 
New York 6,790 5,888 8,318 | New York 8,130 7,586 9,773 
TO NEW YORK FROM | FROM NEW YORK TO 
San Francisco 1,828 1,122 1,288 | San Francisco 3,857 1,995 2,188 
Los Angeles 3,982 2,399 2,454 | Los Angeles 18,942 12,300 12,300 
Denver 413 645 813 | Denver 686 589 794 
(subtotal) 6,223 4,166 4,555 | (subtotal) 23,485 14,884 15,282 
Kansas City 823 1,585 2,188 | Kansas City 682 955 1,349 
Minn.-St. Paul 856 1,203 1,585 | Minn.-St. Paul 840 1,000 1,319 
St. Louis 1,279 1,446 2,138 | St. Louis 702 777 1,230 
(subtotal) 2,998 4,234 5,911 | (subtotal) 2,224 2,732 3,898 
Chicago 8,685 7,586 9,773 | Chicago 6,790 5,889 8,318 
Milwaukee 468 603 871 | Milwaukee 481 550 794 
(subtotal) 9,153 8,189 10,644 | (subtotal) 7,271 6,439 9,112 
Detroit 2,356 2,575 3,389 | Detroit 3,747 4,898 6,607 
Cleveland 2,593 3,091 3,716 | Cleveland 1,686 1,738 2,399 
Pittsburgh 3,301 1,950 2,512 | Pittsburgh 1,307 832 1,318 
Buffalo 1,671 1,319 1,549 | Buffalo 2,074 1,549 1,660 
(subtotal) 9,921 8,935 11,166 | (subtotal) 8,814 9,017 11,984 
Baltimore 3,009 2,455 3,091 | Baltimore 4,011 4,266 5,129 
Washington 8,562 3,982 4,678 | Washington 10,003 13,800 14,460 
Philadelphia 8,624 12,030 12,300 | Philadelphia 6,620 10,000 10,000 
Boston 6,500 3,982 4,571 | Boston 4,306 4,677 3,467 
(subtotal) 26,695 22,449 [illeg.] | (subtotal) 24,940 32,733 33,056 
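
The predicted counts of Table 13 appear to be the antilogarithms of the predicted logarithms of Table 12. A minimal sketch of that conversion follows; the function name is hypothetical, and the four pairs are legible entries read from the two tables (Kansas City to Los Angeles by distance, New York to Chicago and New York to Baltimore by the opportunities model, Los Angeles to Denver by distance). Other entries may differ in the last digit, presumably because the originals were read from printed logarithm tables.

```python
def antilog_count(log_value):
    """Convert a base-10 logarithm from Table 12 into a whole number of migrants."""
    return round(10 ** log_value)

# (logarithm in Table 12, count printed in Table 13)
legible_pairs = [(3.52, 3311), (3.77, 5888), (3.63, 4266), (2.59, 389)]

for log_value, printed in legible_pairs:
    print(log_value, antilog_count(log_value), printed)
```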
MODELS OF TRANSPORTATION AND 
OPTIMAL NETWORK CONSTRUCTION† 


by Richard E. Quandt* 


1. INTRODUCTION 


The standard transportation model in linear programming specifies 
n sources (factories) and m destinations (warehouses) of a commodity and 
is designed to give a least-cost shipping schedule subject to the 
restrictions that no source supplies the commodity in quantities in 
excess of its stated capacity and no destination receives less of the 
commodity than its stated requirement. In this formulation of the 
model it is assumed that direct shipments from any source to any 
destination are feasible and more economical than transshipments through 
other sources or destinations.² In addition, this model requires the 
assumption that the routes or arcs between any pair of nodes consisting 
of a source and a destination have themselves infinite capacity to 
accommodate traffic. The model can therefore be represented geometrically 
as a graph in which every point is connected by an arc with 
infinite capacity to every other point. 


In general it is clearly not true that the arcs connecting two 
points in space have infinite capacity. There is some maximum number 
of trucks per hour or day or year that a given highway can accommodate 
or a maximum number of railroad trains (the length of which may be 
bounded by the maximum possible power output of available engines or by 
terminal facilities) that can run on a given stretch of railroad track 
per unit of time.³ Route capacities are fixed, however, only in the 
short run. Highways can be widened (or where no highway exists at all, 
one can be built) and additional railroad tracks can be laid. 


† I am indebted to A. Doig, R. Gomory, G. Morton and S. Vajda for comments. 
The responsibility for errors is, of course, mine. 


* The author is Assistant Professor of Economics at Princeton University. 


¹ See, for example, Dorfman, R., Samuelson, P.A., Solow, R.M., LINEAR 
PROGRAMMING AND ECONOMIC ANALYSIS (New York: McGraw-Hill, 1958), Ch. 5. 


² For a treatment of the transshipment problem in which this assumption is 
relaxed see Orden, A., "The Transshipment Problem," MANAGEMENT SCIENCE, 
2 (1956), 227-285. 


Consider the case of a state or national authority for the con- 
struction of highways. The objective of highway construction is to 
facilitate the transportation of goods and persons. Since shippers can 
be assumed to seek a least-cost solution to the transportation problem 
for a given highway network, it is not unreasonable to assume that the 
total cost of transportation provides a measure of the success with 
which the network fulfills its task. If it is possible to increase 
route capacities by new construction, one may treat the cost of con- 
struction in two alternative ways. One may either assume that the 
legislature has already voted a certain sum for highway construction, 
in which case the cost of construction becomes a constraint which one 
desires to utilize such that the greatest saving in transportation 
costs is effected or, alternatively, one may take the view that the 
(yet undetermined) cost of construction is ultimately paid for by the 
shippers through gasoline and other taxes, in which case the construction 
cost (in some sense) becomes part of the minimand. 


The objective of this paper is formally to set up, and explore the 
implications of, some models designed to deal with problems of this 
type. For argument’s sake the problem will be dealt with from the 
point of view of a national highway authority intended to build high- 
ways to satisfy future demand. Since such planning must be based on 
forecasts of various kinds, difficult statistical problems of estimation 
may arise. These will be assumed away for the sake of illuminating the 
structural aspects of the problem. It is thus assumed that future 
quantities are known with certainty. To simplify the problem still 
further, it will also be assumed, unless specified otherwise, that 
transportation services are needed for transporting a single type of 
homogeneous commodity. It will be obvious that the relaxation of this 
assumption involves no conceptual problems, only wear and tear of paper 
and pencil. 


The models discussed in the following sections do not aim at 
generality in the sense in which general equilibrium models in location 
theory do. They do, however, aim at incorporating capital costs in the 
model and thus attempt to take into account a rather neglected element. 





³ It must be noted that this assertion represents a considerable 
oversimplification, although it may be acceptable as a first approximation. 
Traffic studies have shown that road capacities depend on the speed of 
vehicles, the minimum safe distance between vehicles, opportunities for 
passing, etc. See Beckmann, M., McGuire, C.B., Winsten, C.B., STUDIES 
IN THE ECONOMICS OF TRANSPORTATION (New Haven: Yale University Press, 
1956), Ch. I. 


⁴ In Lefeber, L., ALLOCATION IN SPACE: PRODUCTION, TRANSPORTATION AND 
INDUSTRIAL LOCATION (Amsterdam: North Holland Publishing Co., 1958) the 
problem of highway construction receives only cursory attention (pp. 
110-111), although the author states (p. 10) that it is not necessary 
that a direct arc exist between every two points. See also Beckmann, 
McGuire, Winsten, op. cit., pp. 107-108. For an approach related to that 
of the present article see Garrison, W. L. and Marble, D. F., "Analysis 
of Highway Networks: A Linear Programming Formulation," HIGHWAY RESEARCH 
BOARD PROCEEDINGS, Vol. 37, 1958. 


2. SIMPLE MODELS 


Consider a geographic space consisting of n + m spatially distributed 
nodes. Let the first n be sources and the remaining m be 
destinations.⁵ Let $p_{ij}$ be the (constant) unit cost of transporting 
goods from source i to destination j. The assumed constancy of $p_{ij}$ 
implies the absence of congestion problems.⁶ Define $x_{ij}$ as the amount 
transported from i to j; $c_{ij}$ as the existing capacity of the arc from 
i to j; $k_{ij}$ as the planned increase in the arc-capacity from i to j; 
$K_i$ as the shipping capacity of the ith source, and $R_j$ as the 
requirement of the jth destination. Clearly the dimension of $x_{ij}$, 
$c_{ij}$, $k_{ij}$, $K_i$, $R_j$ is tons per unit of time and the dimension 
of $p_{ij}$ is dollars per unit of quantity. If one imagines that the 
construction program is financed by the sale of perpetual bonds, one can 
define $r_{ij}$ as the cost per unit of time of increasing the arc-capacity 
from i to j by one ton per unit of time and $\bar{r}_{ij}$ as the entire or 
capital cost of a unit increase in the arc-capacity from i to j. We then 
have the relation $r_{ij} = \bar{r}_{ij}\, r$, where r is the market rate 
of interest. 
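
As a purely hypothetical numerical illustration of the perpetual-bond relation (the figures are assumed for the example and do not come from the text), write $\bar{r}_{ij}$ for the capital cost of a unit of arc-capacity, $r_{ij}$ for the corresponding charge per unit of time, and $r$ for the market rate of interest. If adding one ton per year of capacity costs \$1,000 to build and the rate of interest is 5 percent, the annual charge would be

$$ r_{ij} = \bar{r}_{ij}\, r = 1000 \times 0.05 = 50 \ \text{dollars per year.} $$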


MODEL 1. Consider the case in which the cost of new construction 
is imputed to the shippers (since in the final analysis they will pay 
for highways through taxes). The constraints that a building and shipping 
program must satisfy are 

$$ -\sum_{j=n+1}^{n+m} x_{ij} \ge -K_i, \qquad i = 1, \ldots, n \tag{2-1} $$

$$ \sum_{i=1}^{n} x_{ij} \ge R_j, \qquad j = n+1, \ldots, n+m \tag{2-2} $$

$$ -x_{ij} + k_{ij} \ge -c_{ij}, \qquad i = 1, \ldots, n; \quad j = n+1, \ldots, n+m \tag{2-3} $$

$$ k_{ij} \ge 0, \quad x_{ij} \ge 0, \qquad i = 1, \ldots, n; \quad j = n+1, \ldots, n+m \tag{2-4} $$





⁵ Since we are dealing with the transportation of a single homogeneous 
commodity, every node must be either a net exporter or a net importer. 


⁶ The present models have this in common with the orthodox transportation 
model. The difference is that in the present case constant-cost shipments 
are possible only up to the capacity of the arc from i to j and 
beyond that not at all, whereas in the orthodox transportation problem 
constant-cost shipments from i to j are possible in any volume not 
inconsistent with the capacity of i. To take into account the problem of 
congestion $p_{ij}$ would have to be made a function of the degree of 
utilization of the arc-capacity from i to j. This would make the problem 
nonlinear. 
Constraint (2-1) expresses the restriction that no source can supply 
more than its capacity. Constraint (2-2) expresses the restriction 
that the requirements of every destination be met. Constraint (2-3) 
states that (future) shipments along a particular arc cannot exceed the 
existing plus planned (or rather newly constructed) capacity of that 
arc. Of course, some of the $c_{ij}$ may be zero. Finally, (2-4) expresses 
the usual nonnegativity conditions. 
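
The four restrictions can be checked mechanically for any proposed program. The following is a minimal sketch, assuming a hypothetical network of two sources and two destinations; the function name and all figures are invented for illustration, and destinations are indexed from 0 rather than from n + 1.

```python
def satisfies_constraints(x, k, K, R, c):
    """Return True if shipments x and capacity additions k satisfy (2-1) to (2-4)."""
    sources, dests = range(len(K)), range(len(R))
    arcs = [(i, j) for i in sources for j in dests]
    within_supply = all(sum(x[i][j] for j in dests) <= K[i] for i in sources)  # (2-1)
    meets_demand = all(sum(x[i][j] for i in sources) >= R[j] for j in dests)   # (2-2)
    within_arcs = all(x[i][j] <= c[i][j] + k[i][j] for i, j in arcs)           # (2-3)
    nonnegative = all(x[i][j] >= 0 and k[i][j] >= 0 for i, j in arcs)          # (2-4)
    return within_supply and meets_demand and within_arcs and nonnegative

# Hypothetical data in tons per year.
K = [60, 50]                 # source capacities
R = [40, 50]                 # destination requirements
c = [[30, 20], [0, 40]]      # existing arc capacities; one arc does not exist yet
x = [[30, 20], [10, 30]]     # proposed shipments
k_build = [[0, 0], [10, 0]]  # build 10 tons per year on the missing arc
k_none = [[0, 0], [0, 0]]    # no construction at all

print(satisfies_constraints(x, k_build, K, R, c))
print(satisfies_constraints(x, k_none, K, R, c))
```

In this example the shipping plan is admissible only if the missing arc is built, which is the situation the construction variables are meant to capture.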


The objective function to be minimized is the sum of two sets of 
elements. The first part is $\sum_{i=1}^{n} \sum_{j=n+1}^{n+m} p_{ij} x_{ij}$ 
and expresses the current or operating cost of the shipping program. Its 
dimension is dollars per unit of time (i.e., annual shipping cost). The 
second part is the cost of the construction program. Unfortunately certain 
difficulties arise in connection with introducing the cost of construction 
into the objective function. It may be desirable to consider only the 
current costs of the construction program (i.e., the interest charges) 
in order to make them comparable with the shipping costs which are, by 
definition, current. The cost of construction would then enter the 
minimand as $\sum_{i=1}^{n} \sum_{j=n+1}^{n+m} r_{ij} k_{ij}$ and the 
objective function would be 

$$ \sum_{i=1}^{n} \sum_{j=n+1}^{n+m} \left( p_{ij} x_{ij} + r_{ij} k_{ij} \right) \tag{2-5} $$
i=l j=ntl 


There are at least two difficulties with this formulation. First, 
since the dimension of $r_{ij}$ is dollars per unit of time per unit of 
quantity, the dimension of $\sum_{i=1}^{n} \sum_{j=n+1}^{n+m} r_{ij} k_{ij}$ 
is dollars per unit of time per unit of time, which is not the same as the 
dimension of $\sum_{i=1}^{n} \sum_{j=n+1}^{n+m} p_{ij} x_{ij}$. Secondly, 
(2-5) involves the minimization of the sum of shipping costs and a 
fraction of the actual construction cost without any rational mechanism 
at hand for determining what fraction of the total cost. In other words, 
the objective function is not independent of the time period under 
consideration. To illustrate: assume 
that we are given the capacities and requirements for two successive 
years. If we calculate an optimal program for the first year and then 
an optimal program for the second on the assumption that the first 
year's program has been carried out, the result will generally differ 
from the solution of a single two-year problem in which capacities and 
requirements are defined with reference to a two-year period. 


⁷ It should be noted that the problem is essentially discrete. Additions 
to roads are not continuously variable but measured, perhaps, in terms 
of indivisible "lanes"; hence additions to road capacity ($k_{ij}$) are 
also discrete. 


Denote the first year's program by I and the second year's by II. 
Assume for simplicity that the capacities $K_i$, the requirements $R_j$, 
and the various prices are the same in the two years. Problem I can be 
stated as follows: 

$$ \text{minimize} \quad p'x + r'k \tag{2-6} $$

$$ \text{subject to} \quad Ax \ge s \tag{2-7} $$

$$ -x + k \ge -c \tag{2-8} $$

$$ x \ge 0, \quad k \ge 0 $$

where lower case letters denote column vectors (row vectors if primed) 
and A is the matrix of coefficients corresponding to (2-1) and (2-2). 
Assume that $(x^0, k^0)$ minimizes (2-6) subject to the constraints. Then 
problem II is to 

$$ \text{minimize} \quad p'x + r'k \tag{2-9} $$

$$ \text{subject to} \quad Ax \ge s \tag{2-10} $$

$$ -x + k \ge -c - k^0 \tag{2-11} $$

$$ x \ge 0, \quad k \ge 0 $$
We now prove the following: 

Lemma: If $(x^0, k^0)$ is an optimal solution of I, $(x^0, 0)$ is an 
optimal solution of II. 

Proof. Assuming the contrary, let $(x^*, k^*) \ne (x^0, 0)$ be an optimal 
solution of II. Then $(x^*, k^*)$ is feasible for II and thus 

$$ Ax^* \ge s $$

$$ -x^* + k^* \ge -c - k^0 $$

or 

$$ -x^* + (k^* + k^0) \ge -c $$

Defining $\bar{k} = k^* + k^0$, it is clear that $(x^*, \bar{k})$ is a 
feasible solution of I. Since $(x^0, k^0)$ is optimal for I, 

$$ p'x^0 + r'k^0 \le p'x^* + r'\bar{k} = p'x^* + r'k^* + r'k^0 $$

or 

$$ p'x^0 \le p'x^* + r'k^* \tag{2-12} $$

The vector $(x^0, 0)$ is clearly feasible for II since 

$$ Ax^0 \ge s $$

$$ -x^0 \ge -c - k^0 $$

follows from the feasibility of $(x^0, k^0)$ for I. If $(x^*, k^*)$ is 
optimal for II, 

$$ p'x^* + r'k^* \le p'x^0 \tag{2-13} $$

which implies, together with (2-12), that $p'x^* + r'k^* = p'x^0$, i.e., 
that $(x^0, 0)$ is an optimal solution. 


The minimum cost of the shipping and construction programs for two 
years calculated piecewise is therefore $2p'x^0 + r'k^0$. If we consider 
two years as the unit of time, the $K_i$, $R_j$, $c_{ij}$, $x_{ij}$, 
$k_{ij}$ become multiplied by 2.⁸ Constraints (2-2) to (2-4) remain 
therefore the same. But the cost of construction per unit along the ijth 
arc becomes $2r_{ij}$ (two interest payments being involved),⁹ whereas 
the unit cost of shipping remains $p_{ij}$. Thus the objective function 
becomes 

$$ p'x + 2r'k \tag{2-14} $$

The solution which minimizes (2-14) is not generally the same as 
$(2x^0, 2k^0)$, which it would have to be in order to provide the same 
program as the alternative method.¹⁰ We have thus proved the following 


Theorem: If the minimand includes the annual cost of the con- 
struction program, the optimal solution is not independent of the 
length of time for which the variables are defined. 
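
The lemma and the theorem can be illustrated numerically. The sketch below is a hypothetical toy case, not taken from the text: one destination is served by two sources, source capacities are assumed never to bind, and the optimum is found by brute force over whole-ton shipments rather than by linear programming. The function name and all figures are assumptions.

```python
def best_program(p, r, c, R, interest_periods=1):
    """Brute-force the cheapest way to ship R tons to one destination from two sources.

    p: unit shipping costs, r: interest charge per unit of new capacity per period,
    c: existing arc capacities, interest_periods: interest payments in the time unit.
    Returns (total cost, shipments, capacity additions).
    """
    best = None
    for x1 in range(R + 1):
        x2 = R - x1
        # Build only the capacity actually needed on each arc.
        k = (max(0, x1 - c[0]), max(0, x2 - c[1]))
        cost = p[0] * x1 + p[1] * x2 + interest_periods * (r[0] * k[0] + r[1] * k[1])
        if best is None or cost < best[0]:
            best = (cost, (x1, x2), k)
    return best

p, r = (1, 3), (1.5, 1.5)

# Problem I: first year, no capacity yet on the cheap arc.
year_one = best_program(p, r, c=(0, 10), R=10)
# Problem II: second year, with the first year's construction in place.
year_two = best_program(p, r, c=(0 + year_one[2][0], 10), R=10)
# Single two-year problem: quantities doubled, two interest payments per unit built.
two_year = best_program(p, r, c=(0, 20), R=20, interest_periods=2)

print(year_one)
print(year_two)
print(two_year)
```

Under these assumed figures the year-by-year calculation builds ten tons of capacity on the cheap arc and then repeats the same shipments with no further construction, as the lemma states, whereas the two-year formulation builds nothing, so the two methods recommend different programs.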





⁸ If a source can supply 100 tons per year, it can supply 200 tons per 
two-year period; if a route capacity is 50 tons per year, it is 100 
tons per two years, etc. 


⁹ We abstract from any difficulties of compounding. 


¹⁰ The optimal solutions of I and II are $(x^0, k^0)$ and $(x^0, 0)$. The 
total amount of shipping is therefore $2x^0$ for a two-year period. The 
construction in the first year of new highways with capacity $k^0$ provides 
additional transportation capacity of $2k^0$ tons per two-year period. The 
two methods give the same solution if $k^0 = 0$, in which case there is no 
capital program and the problem does not arise. 
The alternative method is to write the objective function as 

$$ \sum_{i=1}^{n} \sum_{j=n+1}^{n+m} \left( p_{ij} x_{ij} + \bar{r}_{ij} k_{ij} \right) \tag{2-15} $$

where $\bar{r}_{ij}$ is the entire cost of increasing the arc-capacity from 
i to j by one unit. The formulation avoids both of the above difficulties: 
the dimension of $\bar{r}_{ij} k_{ij}$ is the same as that of 
$p_{ij} x_{ij}$ and the optimal 


program is independent of the time period for which the variables are 
defined. But unfortunately this formulation gets into other equally 
serious difficulties. The justification for putting construction cost 
in some form into the objective function was the statement that the 
shippers would ultimately pay for the new roads. Writing the minimand 
as in (2-15) is then equivalent to charging the entire cost of con- 
struction to users in the first year. This might be appropriate if the 
new road “wore out” within the year but not if it has a longer life 
expectancy. Since neither of these formulations is satisfactory, it is 
clear that a different model is required. 


MODEL 2. Consider the case in which the legislature has already 
appropriated the sum of M dollars for highway construction. The 
objective then is to 


$$ \text{minimize} \quad \sum_{i=1}^{n} \sum_{j=n+1}^{n+m} p_{ij} x_{ij} \tag{2-16} $$

subject to 

$$ -\sum_{j=n+1}^{n+m} x_{ij} \ge -K_i, \qquad i = 1, \ldots, n \tag{2-17} $$

$$ \sum_{i=1}^{n} x_{ij} \ge R_j, \qquad j = n+1, \ldots, n+m \tag{2-18} $$

$$ -x_{ij} + k_{ij} \ge -c_{ij}, \qquad i = 1, \ldots, n; \quad j = n+1, \ldots, n+m \tag{2-19} $$

$$ x_{ij} \ge 0, \quad k_{ij} \ge 0 $$

$$ -\sum_{i=1}^{n} \sum_{j=n+1}^{n+m} \bar{r}_{ij} k_{ij} \ge -M \tag{2-20} $$

where the last constraint expresses the restriction that the total 
amount spent on construction must not exceed the appropriated amount.¹¹ 
Model 2 is clearly more satisfactory conceptually than either formulation 
of Model 1, although the presence of constraint (2-20) increases 
the likelihood that no feasible solution exists. 


The interpretation of the dual is relevant and interesting in its 
own right. Let the vector of dual variables be $(u', v', w', t)$ where 
$u'$, $v'$, and $w'$ represent row vectors with n, m, and nm elements and 
where t is a scalar. The dual problem is to 

$$ \text{maximize} \quad -\sum_{i=1}^{n} u_i K_i + \sum_{j=n+1}^{n+m} v_j R_j - \sum_{i=1}^{n} \sum_{j=n+1}^{n+m} w_{ij} c_{ij} - tM \tag{2-21} $$

$$ \text{subject to} \quad -u_i + v_j - w_{ij} \le p_{ij}, \qquad i = 1, \ldots, n; \quad j = n+1, \ldots, n+m \tag{2-22} $$

$$ w_{ij} - t\,\bar{r}_{ij} \le 0, \qquad i = 1, \ldots, n; \quad j = n+1, \ldots, n+m \tag{2-23} $$

The dual variable $u_i$ can be interpreted as the imputed f.o.b. price at 
the ith source, $v_j$ as the imputed delivered price at the jth 
destination, $w_{ij}$ as the imputed value of a ton's worth of arc-capacity 
along the ijth route (i.e., the imputed highway toll for using the ijth arc) 
and t as the imputed price of money or, rather, as the imputed rate of 
interest. The variable t measures the reduction in transport costs 
that would be effected if an additional dollar were available for 
construction. It is noteworthy and not unexpected that t is dimension- 
less by virtue of the fact that (2-21) has dollar dimension; hence the 
interpretation of t as the imputed rate of interest is not unreasonable. 
Constraints (2-22) require that the net profit after tolls on each 
route not be in excess of the actual cost of transportation on the 
route, and (2-23) requires the toll charge to be less than or equal to 
the interest charge for each route. These constraints express the 
usual zero profit conditions: corresponding to a primal activity 
actually used profits will be zero and primal activities corresponding 
to strict inequalities in the dual system (negative profit) will not be 
used. The maximization of (2-21) is equivalent to the maximization of 
imputed net profit. 


The model permits in an indirect way the solution of the problem 
of how to determine the optimal expenditure on route construction. As 
was stated before, the dual variable t is the imputed rate of interest, 
i.e., the improvement in the objective function (lasting into perpetuity) 
which would be caused by a one-dollar increase in spendable funds. It 
is then relevant to compare this t with the market rate of interest r. 
If t > r it is economical (and beneficial to welfare if the assumptions 
of the model hold) to borrow an additional dollar for construction. If 
t < r, it would be profitable to repay part of the debt incurred for the 
sake of construction. Of course, it is irrelevant for the argument 
whether the road construction authority borrows directly in the market 
for funds or is financed by an appropriation by the legislature. In either 
case the decision criterion involves the use of an opportunity cost 
principle. 


¹¹ If the cost of construction is covered by borrowing, M represents the 
amount appropriated annually in perpetuity to cover the interest charge. 
In that case $\bar{r}_{ij}$ must be replaced by $r_{ij}$ in (2-20). 


3. THE TRANSSHIPMENT PROBLEM 


All shipments in the previous models were direct ones. The 
restrictive nature of such models is well known. A direct arc from i to 
j may not exist and it may be too expensive to build a direct link. 
Moreover, it may be more economical to ship from i to j via some inter- 
mediate nodes (other sources or destinations) than to ship directly. 
Since every shipment places a burden on the finite capacity of some arc 
or arcs, it is necessary to keep track in the model of the routes which 
various shipments take. This can be accomplished in at least two ways 
described under Models 3 and 4 respectively. 


MODEL 3. One can keep track of all shipments by a complete
enumeration of the possible routes that a shipment from i to j may take.
Denote by $x_{ij(0 \dots 0)}$, or simply by $x_{ij}$, the amount shipped
directly from i to j. Denote by $x_{ij(k_1 \dots k_{n+m-2})}$ the
shipment from source i to destination j which goes from i to $k_1$
($k_1 = 1,\dots,n+m$ but $\ne$ i or j), from $k_1$ to $k_2$
($k_2 = 1,\dots,n+m$ but $\ne$ i, j, or $k_1$), etc., and finally goes
from $k_{n+m-2}$ to j. The quantity $x_{ij(k_1 \dots k_{n+m-2})}$ thus
represents an indirect shipment from i to j which passes through all
other nodes as well. The shipment which passes through h intermediate
nodes (h < n+m-2) can be represented as $x_{ij(k_1 \dots k_h 0 \dots 0)}$,
or $x_{ij(k_1 \dots k_h)}$ for short. The total number of indirect
shipments from i to j going through one intermediate node is n + m - 2;
the total number going through two intermediate nodes is
(n + m - 2)(n + m - 3); the total number through h intermediate nodes is
(n + m - 2)!/(n + m - 2 - h)!. Therefore the total number of possible
direct and indirect shipments from i to j is

$$\sum_{h=0}^{n+m-2} \frac{(n+m-2)!}{(n+m-2-h)!}.$$

Obviously, no shipment is routed through the same node twice; that is,
no loops are permitted, since activities involving loops are clearly
inefficient. The unit cost of shipping along the route
$ij(k_1 \dots k_{n+m-2})$ is $p_{ij(k_1 \dots k_{n+m-2})}$. This
quantity may equal the sum of the unit costs on the successive arcs of
the route,

$$p_{ik_1} + \sum_{l=2}^{n+m-2} p_{k_{l-1}k_l} + p_{k_{n+m-2}j},$$

or exceed it in the case when transshipment involves reloading or delays
at one or more transshipment points. The unit cost may be less than this
sum if tariffs are set in such a way that long hauls are relatively
cheaper than short hauls.
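
The route count above grows very quickly with the number of nodes. The following is a minimal Python sketch, not part of the original paper, that evaluates the sum and enumerates the loop-free routes for one pair of nodes. The function names `count_routes` and `enumerate_routes` are illustrative assumptions.

```python
from itertools import permutations
from math import factorial


def count_routes(n_nodes):
    """Number of loop-free routes (direct and indirect) between one fixed
    pair of nodes in a network of n_nodes nodes, i.e. the sum over h of
    (N)! / (N - h)! with N = n_nodes - 2 intermediate candidates."""
    others = n_nodes - 2
    return sum(factorial(others) // factorial(others - h)
               for h in range(others + 1))


def enumerate_routes(i, j, nodes):
    """Yield every loop-free route from i to j as a tuple of node labels."""
    others = [k for k in nodes if k not in (i, j)]
    for h in range(len(others) + 1):
        # permutations(others, h) lists ordered choices of h intermediate nodes
        for middle in permutations(others, h):
            yield (i, *middle, j)
```

For a network of only ten nodes, `count_routes(10)` gives 109,601 routes for a single source-destination pair, which illustrates why Model 3 becomes impractical.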


Two formulations suggest themselves according to whether we regard 
the various arcs as directed or undirected. 


The undirected case. Define by $S^{ij}_g$ the set of permutations of the
n + m - 2 nodes other than i and j taken g at a time. Also define
$S^{ir}_{jg}$ to be the subset of $S^{ir}_g$ consisting of the
permutations which have j for first element; $S^{sj}_{gi}$ to be the
subset of $S^{sj}_g$ consisting of the permutations which have i for
last element; and $S^{sr}_{ij,g}$ to be the subset of $S^{sr}_g$
consisting of the permutations in which i and j are adjacent and in the
given order. Abbreviating $(k_1 \dots k_g)$ by $(k)_g$, the objective is
to

$$\text{minimize} \quad \sum_{i=1}^{n} \sum_{j=n+1}^{n+m} \sum_{g=0}^{n+m-2} \sum_{(k)_g \in S^{ij}_g} p_{ij(k)_g}\, x_{ij(k)_g} \tag{3-1}$$

subject to

$$-\sum_{j=n+1}^{n+m} \sum_{g=0}^{n+m-2} \sum_{(k)_g \in S^{ij}_g} x_{ij(k)_g} \ge -K_i, \qquad i = 1,\dots,n \tag{3-2}$$

$$\sum_{i=1}^{n} \sum_{g=0}^{n+m-2} \sum_{(k)_g \in S^{ij}_g} x_{ij(k)_g} \ge R_j, \qquad j = n+1,\dots,n+m \tag{3-3}$$

$$-\sum_{i=1}^{n+m-1} \sum_{j=i+1}^{n+m} r_{ij} k_{ij} \ge -M \tag{3-4}$$

$$\begin{aligned}
-x_{ij} &- \sum_{\substack{r=n+1 \\ r \ne j}}^{n+m} \sum_{g=1}^{n+m-2} \sum_{(k)_g \in S^{ir}_{jg}} x_{ir(k)_g}
 - \sum_{\substack{s=1 \\ s \ne i}}^{n} \sum_{g=1}^{n+m-2} \sum_{(k)_g \in S^{sj}_{gi}} x_{sj(k)_g} \\
&- \sum_{\substack{s=1 \\ s \ne i}}^{n} \sum_{\substack{r=n+1 \\ r \ne j}}^{n+m} \sum_{g=2}^{n+m-2} \sum_{(k)_g \in S^{sr}_{ij,g} \cup S^{sr}_{ji,g}} x_{sr(k)_g}
 + k_{ij} \ge -c_{ij}, \qquad i = 1,\dots,n;\quad j = n+1,\dots,n+m
\end{aligned} \tag{3-5}$$

and

$$-\sum_{j=n+1}^{n+m} \sum_{g=1}^{n+m-2} \sum_{(k)_g \in S^{sj}_{rg}} x_{sj(k)_g}
 - \sum_{j=n+1}^{n+m} \sum_{g=1}^{n+m-2} \sum_{(k)_g \in S^{rj}_{sg}} x_{rj(k)_g}
 + k_{sr} \ge -c_{sr}, \qquad s = 1,\dots,n-1;\quad r = 2,\dots,n;\quad s < r \tag{3-6}$$

$$-\sum_{i=1}^{n} \sum_{g=1}^{n+m-2} \sum_{(k)_g \in S^{iq}_{gp}} x_{iq(k)_g}
 - \sum_{i=1}^{n} \sum_{g=1}^{n+m-2} \sum_{(k)_g \in S^{ip}_{gq}} x_{ip(k)_g}
 + k_{pq} \ge -c_{pq}, \qquad p = n+1,\dots,n+m-1;\quad q = n+2,\dots,n+m;\quad p < q \tag{3-7}$$

$$x_{ij(k)_g} \ge 0, \qquad k_{ij} \ge 0.$$


This model is called the undirected case because the arc between i
and j is characterized by a single capacity number $c_{ij}$, and no
distinction is made between shipments from i to j and from j to i. If
the ijth arc-capacity is increased by one unit, this increase in
capacity can be used to accommodate traffic in either direction.


The directed case arises when providing capacity for shipping from
i to j is not equivalent to providing capacity in the opposite direction.
Imagine, for example, that a divided highway already exists between i
and j with two lanes in each direction. If a third lane were built on
the side corresponding to traffic from i to j, it could not be utilized
for traffic going from j to i without creating serious traffic hazards.
In order to deal with this problem constraints (3-5), (3-6), and (3-7)
must each be replaced by a pair of constraints, one for each direction,
which creates no new conceptual difficulties.


The obvious disadvantage of Model 3 is the extraordinarily large
number of activities it requires for any network with a reasonable
number of nodes. Even if certain activities are ruled out a priori (we
may never allow, for example, a shipment originating in Chicago with
final destination Los Angeles to go via New York), the number of
activities is likely to remain too large. However, the problem can be
reformulated in a practical fashion at virtually negligible sacrifice
by following Orden's solution of the transshipment problem.


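MODEL 4. Given an ordinary transportation problem to

$$\text{minimize} \quad \sum_{i=1}^{n} \sum_{j=n+1}^{n+m} p_{ij} x_{ij}$$

subject to

$$-\sum_{j=n+1}^{n+m} x_{ij} \ge -K_i, \qquad i = 1,\dots,n$$

$$\sum_{i=1}^{n} x_{ij} \ge R_j, \qquad j = n+1,\dots,n+m$$

$$x_{ij} \ge 0,$$
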

the corresponding transshipment problem can be solved by assuming that
each potential transshipment point possesses very large inventories of
the commodity. By considering each node to be the union of two
"half-nodes," one of which only receives from the outside and the other
of which only ships to the outside, with zero transport cost between
them, the transshipment problem is solved by solving the following
transportation problem:

$$\text{minimize} \quad \sum_{i=1}^{n+m} \sum_{j=1}^{n+m} p_{ij} x_{ij} \qquad (p_{ii} = 0) \tag{3-8}$$

subject to

$$-\sum_{j=1}^{n+m} x_{ij} = -K_i - G, \qquad i = 1,\dots,n+m \tag{3-9}$$

$$\sum_{i=1}^{n+m} x_{ij} = R_j + G, \qquad j = 1,\dots,n+m \tag{3-10}$$

$$x_{ij} \ge 0$$

where $G \ge \sum_{j=n+1}^{n+m} R_j$, $R_j = 0$ for j = 1,...,n, and
$K_i = 0$ for i = n+1,...,n+m. Equations (3-9) and (3-10) are written
as equalities above on the assumption that the appropriate disposal
activities have already been introduced. One may then add the
restrictions

$$-x_{ij} + k_{ij} \ge -c_{ij}, \qquad i, j = 1,\dots,n+m;\quad i \ne j \tag{3-11}$$

$$-\sum_{i=1}^{n+m} \sum_{j=1}^{n+m} r_{ij} k_{ij} \ge -M \qquad (r_{ii} = 0) \tag{3-12}$$

This completes the formulation of Model 4, which is an extension of
Model 2 to the case of transshipment and directed arcs.13 The only
sacrifice involved in Model 4 (as opposed to Model 3) is the fact that
in Model 3 the unit cost of shipping from i to j may be made to depend
upon the position of the ijth segment in an entire route. This is
surely a modest cost in exchange for a program of feasible dimensions.

12 Orden, A., op. cit.
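
The half-node construction in (3-8) to (3-10) can be sketched in Python as follows. This is an illustrative sketch, not the author's procedure: the function name `orden_tableau`, the zero-based indexing, and the default choice of the buffer stock G as the sum of the requirements are assumptions made here. The sketch only builds the data of the enlarged transportation problem and does not solve it.

```python
def orden_tableau(K, R, p, G=None):
    """Build row totals, column totals and costs of the enlarged
    transportation problem (3-8)-(3-10).

    K: capacities of the n sources (nodes 0..n-1)
    R: requirements of the m destinations (nodes n..n+m-1)
    p: (n+m) x (n+m) matrix of unit costs between all pairs of nodes
    G: buffer stock held at every node; any G >= sum(R) is admissible
    """
    n, m = len(K), len(R)
    size = n + m
    if G is None:
        G = sum(R)  # smallest admissible buffer
    # Row totals: K_i + G for sources, G for destinations (K_i = 0 there).
    supply = [(K[i] if i < n else 0) + G for i in range(size)]
    # Column totals: G for sources (R_j = 0 there), R_j + G for destinations.
    demand = [(R[j - n] if j >= n else 0) + G for j in range(size)]
    # Shipping from a node to itself costs nothing (p_ii = 0).
    cost = [[0 if i == j else p[i][j] for j in range(size)]
            for i in range(size)]
    return supply, demand, cost
```

In a solution of the enlarged problem, the diagonal entry for node i is the part of the buffer left untouched at that node, so the amount shipped into node i from other nodes is G minus that entry.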


4. REMARKS ON MISCELLANEOUS PROBLEMS

Numerous other problems may have to be considered in order to make
the various models more realistic. A method of handling several
problems is suggested below.


13 The case of undirected arcs can be handled equally easily. Replace
(3-11) by

$$-x_{ij} - x_{ji} + k_{ij} \ge -c_{ij}, \qquad i = 1,\dots,n+m-1;\quad j = i+1,\dots,n+m$$

and (3-12) by

$$-\sum_{i=1}^{n+m-1} \sum_{j=i+1}^{n+m} r_{ij} k_{ij} \ge -M.$$

INTERNODE JUNCTIONS. It is obviously possible in a given
network that three routes meet at a point which is neither a source nor a
destination. Alternatively, one may believe that if an arc connecting
i' and k' is to be built at all, it should emanate from some stated
point along the i'j'th arc rather than from i' itself. This problem
can be solved in terms of the previous models by regarding such
junctions as (fictitious) nodes. Let subscript f be the label of this
node. Assuming that i' is a source and j' and k' destinations and that
transshipment is generally not allowed (Model 2), we know that every
shipment from i' to j' and k' must go through f. Then the objective is
to

$$\text{minimize} \quad \sum_{i=1}^{n} \sum_{j=n+1}^{n+m} p_{ij} x_{ij} + p_{i'f} x_{i'f} + p_{fj'} x_{fj'} + p_{fk'} x_{fk'}$$

subject to (with the understanding that i = i' implies j ≠ j', k' in
all sums)

$$-\sum_{j=n+1}^{n+m} x_{ij} \ge -K_i$$

$$\sum_{i=1}^{n} x_{ij} \ge R_j$$

(the junction flow $x_{i'f}$ being counted with the shipments out of i',
and $x_{fj'}$, $x_{fk'}$ with the shipments into j' and k' respectively)

$$x_{i'f} - x_{fj'} - x_{fk'} = 0$$

$$-x_{ij} + k_{ij} \ge -c_{ij}, \qquad i = 1,\dots,n;\quad j = n+1,\dots,n+m;\quad i = i' \Rightarrow j \ne j', k'$$

$$-x_{i'f} + k_{i'f} \ge -c_{i'f}$$

$$-x_{fj'} + k_{fj'} \ge -c_{fj'}$$

$$-x_{fk'} + k_{fk'} \ge -c_{fk'}$$

$$-\sum_{i} \sum_{j} r_{ij} k_{ij} - r_{i'f} k_{i'f} - r_{fj'} k_{fj'} - r_{fk'} k_{fk'} \ge -M$$

The meaning of the constraints is obvious, and the treatment of this
problem in the transshipment case is equally easy along analogous lines.


MULTICOMMODITY MODELS. For simplicity's sake consider a network
of n+m nodes from and to which two commodities have to be shipped.
Let the amounts of the first commodity shipped be denoted by $x_{ij}$
and those of the second by $y_{ij}$. As regards the first commodity,
i = 1,...,n represents sources and j = n+1,...,n+m destinations. Let the
numbering of the nodes be such that as regards the second commodity
i = 1,...,t and i = r+1,...,n+m (t < n < r) are sources and
j = t+1,...,r are destinations. This way of numbering the nodes allows
some nodes to ship both commodities, some to receive both, some to ship
the first and receive the second, and some to ship the second and
receive the first. Denoting the unit cost of transporting the second
commodity from i to j by $q_{ij}$ and the capacities and requirements
for the two commodities by $K_i^x$, $R_j^x$, $K_i^y$, $R_j^y$, we wish
to

$$\text{minimize} \quad \sum_{i=1}^{n} \sum_{j=n+1}^{n+m} p_{ij} x_{ij} + \sum_{i=1}^{t} \sum_{j=t+1}^{r} q_{ij} y_{ij} + \sum_{i=r+1}^{n+m} \sum_{j=t+1}^{r} q_{ij} y_{ij}$$

subject to

$$-\sum_{j=n+1}^{n+m} x_{ij} \ge -K_i^x, \qquad i = 1,\dots,n$$

$$\sum_{i=1}^{n} x_{ij} \ge R_j^x, \qquad j = n+1,\dots,n+m$$

$$-\sum_{j=t+1}^{r} y_{ij} \ge -K_i^y, \qquad i = 1,\dots,t,\ r+1,\dots,n+m$$

$$\sum_{i=1}^{t} y_{ij} + \sum_{i=r+1}^{n+m} y_{ij} \ge R_j^y, \qquad j = t+1,\dots,r$$

$$-x_{ij} - y_{ij} + k_{ij} \ge -c_{ij}, \qquad i = 1,\dots,t;\quad j = n+1,\dots,r$$

$$-x_{ij} + k_{ij} \ge -c_{ij}, \qquad i = 1,\dots,t;\quad j = r+1,\dots,n+m$$

$$-x_{ij} + k_{ij} \ge -c_{ij}, \qquad i = t+1,\dots,n;\quad j = n+1,\dots,n+m$$

$$-y_{ij} + k_{ij} \ge -c_{ij}, \qquad i = 1,\dots,t;\quad j = t+1,\dots,n$$

$$-y_{ij} + k_{ij} \ge -c_{ij}, \qquad i = r+1,\dots,n+m;\quad j = t+1,\dots,r$$

and finally

$$\begin{aligned}
&-\sum_{i=1}^{t} \sum_{j=n+1}^{r} r_{ij} k_{ij} - \sum_{i=t+1}^{n} \sum_{j=n+1}^{n+m} r_{ij} k_{ij} - \sum_{i=1}^{t} \sum_{j=r+1}^{n+m} r_{ij} k_{ij} \\
&-\sum_{i=1}^{t} \sum_{j=t+1}^{n} r_{ij} k_{ij} - \sum_{i=r+1}^{n+m} \sum_{j=t+1}^{r} r_{ij} k_{ij} \ge -M
\end{aligned}$$

Arcs are considered to be directed in this formulation. An undirected
formulation is not very meaningful because traffic between two points
may now flow in both directions. The extension of this model to the
case of transshipment is obvious and need not be discussed separately.


TERMINAL FACILITIES. The problem of terminals may not arise
in connection with a highway network, but it is almost certain to arise
in some form in connection with a rail network or a municipal transport
system. If it can be assumed that the cost of building a new or
enlarging an old terminal at node i is proportional to the excess of
actual traffic over the terminal's capacity $Q_i$ (which equals zero if
the terminal does not exist), the following restrictions have to be
added to those of the transshipment model:

$$-\sum_{\substack{j=1 \\ j \ne i}}^{n+m} x_{ij} - \sum_{\substack{j=1 \\ j \ne i}}^{n+m} x_{ji} + T_i \ge -Q_i, \qquad i = 1,\dots,n+m$$

where $T_i$ is the planned increase in the capacity of the ith terminal;

$$-\sum_{i=1}^{n+m} s_i T_i + M_1 \ge 0$$

where $s_i$ is the dollar cost of increasing the ith terminal's capacity
by one unit and $M_1$ is the amount set aside for terminal construction.
We replace constraint (2-20) by

$$-\sum_{i=1}^{n+m} \sum_{j=1}^{n+m} r_{ij} k_{ij} + M_2 \ge 0$$

where $M_2$ is the amount set aside for roadbed or track construction,
and finally we introduce

$$-M_1 - M_2 \ge -M$$

where M is the total amount available for construction purposes. There
is therefore an ultimate capital constraint given by M, but the
distribution of this sum between road construction and terminal
construction is to be determined by the program. As in Section 2 the
cost of terminal construction should not be part of the objective
function, and $s_i$ can be interpreted as total or annual cost,
depending on the nature of the problem.

14 Since the volume of traffic in and out of a node is fixed in Model 2
by the $K_i$ and $R_j$, the terminal building or enlarging part of the
program is trivial.
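
When the flows are already known, the terminal restriction fixes the smallest admissible expansion at each node. The Python sketch below illustrates this under two assumptions that are not stated in the original: traffic at a terminal is taken to be the sum of shipments out of and into the node, and the function name `terminal_expansion` is hypothetical.

```python
def terminal_expansion(flows, Q, s):
    """Smallest terminal expansions T_i for given flows, and their cost.

    flows[i][j]: amount shipped from node i to node j
    Q[i]: existing terminal capacity at node i (0 if no terminal exists)
    s[i]: dollar cost of one extra unit of capacity at node i
    """
    size = len(Q)
    T = []
    for i in range(size):
        # Assumed definition of traffic: everything leaving or entering node i.
        traffic = sum(flows[i][j] + flows[j][i]
                      for j in range(size) if j != i)
        # T_i >= traffic - Q_i and T_i >= 0, so the minimum is the excess.
        T.append(max(0, traffic - Q[i]))
    cost = sum(s_i * T_i for s_i, T_i in zip(s, T))
    return T, cost
```

The returned cost is the least value that the terminal allotment must cover for these flows; in the full program the flows and the split of M are chosen jointly.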


INVOLUNTARY INTERSECTIONS. Select for consideration out of
an entire network the four nodes labeled i, j, r, and s and assume that
initially there are no arcs in existence connecting them with each
other. Assume that an optimal program (based, say, on Model 4) requires
the construction of an arc from i to j and one from r to s. If the
geographic location of the four nodes is such that i and j lie on
opposite vertices of a rectangle formed by the four nodes, an involuntary
intersection will be created. The construction of an intersection
represents a new cost not heretofore considered in the objective
function. The rationale for devoting special attention to these costs
is as follows. If two roads meet, special provisions must be made: an
over- or underpass must be built and ramps may have to be constructed
in order to allow a vehicle coming from any direction to proceed in any
other direction. These special costs cannot be included in the cost of
the construction of either road, since they arise only if both roads are
built.


The method for solving problems of this kind will be indicated
below for a special case without an attempt at a completely general
formulation. Consider the $k_{ij}$ as integral valued variables and
assume that initially there is no arc between i and j and between r and
s. Define the additional integral valued variables $w_{ij}$, $w_{rs}$,
$u_{ij,rs}$ by the following constraints:
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1 2, 
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-1 ¢ Y5 j ’ Fes ~ Vij,rs 


> 
= 9u. 
at ” Pus “Ui j,rs 
where P is a very large positive number ($P > \max_{i,j} M/p_{ij}$). As
can be verified easily, $u_{ij,rs} = 1$ if both arcs are built and
$u_{ij,rs} = 0$ otherwise.


One may then add the term $a_{ij,rs} u_{ij,rs}$ to the objective
function, where $a_{ij,rs}$ is the cost of building the (ij,rs)th
intersection. This type of treatment can easily be generalized to take
into account cases in which one or both arcs already exist and where the
special costs are due to widening (rather than creating) intersections.


Some additional problems which arise from the presence of
intersections and which can also be handled with relative ease are as
follows: (a) If the (ij)th arc intersects the (rs)th arc, the capacity
of the intersection for traffic in the (ij) direction will depend on the
volume of traffic in the (rs) direction and conversely, unless the
intersection is of the overpass type. (b) If the intersection is of
the overpass variety, access links or ramps permitting a vehicle coming
from any direction to proceed in any other direction may or may not
exist. If they exist, they must and can be explicitly allowed for in
the mathematical formulation.


Clearly, the number of activities and constraints will increase
considerably if involuntary intersections are to be considered. For
many types of construction problems it will, however, be safe to omit
this complication. Generally speaking this will be the case when the
number of potential involuntary intersections is small relative to the
road mileage to be built.


15 For a more detailed discussion see Beckmann, McGuire, Winsten, op.
cit., Ch. 1.


6. CONCLUSION


It has been shown that no serious conceptual problems are created
by combining a transportation model with a road construction model,
provided that the maximum amount that can be spent on construction is
given from outside the model. It is significant that there are strong
theoretical reasons for not considering the construction cost as part
of the objective function. The present models thus do not solve the
problem of how much to spend on construction (except in the case in
which the total amount M is not used up). From the welfare-theoretic
point of view it would be desirable to have the model determine M, the
magnitude of which is left by the models at the discretion of, say, the
legislature. One of the basic reasons for this is that the models are
unable to provide criteria by which a dollar's worth of construction
cost saved can be compared to a dollar's worth of transport cost saved.


However, it is still possible to make a limited welfare judgment
and determine the value of M. If construction costing one dollar
annually in perpetuity resulted in a saving of transport costs of more
than one dollar annually in perpetuity, it would be reasonable to argue
that the particular construction job should be undertaken. It would
therefore be relevant to compare the imputed rate of interest t with the
actual or market rate r and to prevail upon the legislature to increase
the appropriation whenever t > r. Alternatively one might state this
as follows: since the imputed prices represent the improvement in the
objective function resulting from a unit increase in the available
supply of a scarce resource, t is the fraction of a dollar by which
shipping costs would diminish (annual saving in perpetuity) if an extra
dollar were available for construction. If the annual cost r of an
extra dollar is less than t, the extra dollar should be procured.


Various other problems are discussed in the remainder of the paper.
Transshipment is introduced into the model and the models are extended
to cover (1) problems raised by the existence of junctions which are
not genuine nodes, (2) the construction of terminal facilities, (3) the
transportation of several commodities, and (4) the problem of
involuntary intersections. It is clear, of course, that the models
discussed exhibit only general methods for dealing with certain
categories of problems; in any concrete application numerous
modifications and refinements would remain to be performed.


CONTRIBUTIONS TO A STATISTICAL
METHODOLOGY FOR AREAL DISTRIBUTIONS


by William Warntz and David Neft* 


1. INTRODUCTION 


The principal purpose of this paper is to indicate how a formal 
body of statistical methods might be developed expressly to describe 
spatial distributions and to present some of these measures and methods 
already defined. 


Let us begin by assuming a population (in the statistical sense) 
distributed over an area. What parameters are there available to de- 
scribe this distribution? If two of the basic concepts of statistics - 
deviation and frequency - can be established for spatial distributions, 
then others follow. 


In conventional statistics one may think of a deviation as the 
difference between one value of a variable and some other selected 
value, as for example, the difference between the arithmetic mean of 
a series and some value in the series. This difference is expressed, 
of course, in the same units as the values of the variable. In general 
a deviation may be considered as the difference between any two values. 

But one can also think of a deviation as a "length" or "distance"
between positions on the mathematical line representing the values of
the variable over which the population is distributed. This idea is
easily transferred to distributions over actual physical areas, i.e.,
spatial distributions. Here, a deviation may be thought of as the
physical distance between the position of a unit of the population and
another specified point,2 with the distance expressed in some
convenient units of physical length such as miles.


* American Geographical Society, New York, N. Y.

1 In "Some Parameters of the Geographical Distribution of Population,"
GEOGRAPHICAL REVIEW, 49 (1959), pp. 270-272, John Q. Stewart and William
Warntz offered, among other things, a brief presentation of the extension
of the conventional measures of central tendency and dispersion to
spatial distributions.


For conventional statistics, frequency refers to the number of 
units of the population falling within a specified class. For the 
spatially distributed population, the counterpart of class is a segment 
of area. Therefore, “frequency” is the number of units of the popula- 
tion falling within a specified area. The ratio of population to area 
is the familiar density of population. 


From our definition of deviation we can construct a system of pa- 
rameters to describe spatial distributions. Using densities we can 
employ the computational conveniences associated with “grouped’’ data 
as well as investigate certain models of spatial distribution. 


2. AVERAGE POSITIONS AND DISPERSION 


There are two fundamental ways in which spatial distributions may 
differ in addition to variations in the total numbers involved in their 
populations. They may differ first in terms of the particular position 
(or positions) about which the units tend to cluster, (i.e., the average 
position), and second in the way they are dispersed about this position. 


a. Mean Center 


The mean center of an areally distributed population is its
"balancing point," an average position which is the exact equivalent of
the arithmetic mean of the conventional linear frequency distribution.
In our system this is the point where the sum of the squares of the
distances to the individuals comprising the population will be a
minimum. Other methods of computation exist,3 and this center can be
considered, for example, as analogous to the "center of gravity." The
mean center occurs at the position where the value of the following
expression is a minimum:

$$\int r^2 D \, dA,$$

where D is the density of population over an infinitesimal element of
area dA, and r is the distance from each said element of area to the
point in question.

2 Of course one can describe areal distributions by reference to
rectilinear x and y coordinates taken from an arbitrary origin.
Deviations are then measured from arbitrary axes and conventional
algebraic signs observed. This, however, regards a distribution in terms
of two independent variables which when combined do not necessarily
specify the distribution completely.

3 This mean center can also be found directly as that point which
simultaneously satisfies the requirements -- mean of the x's and mean of
the y's.

When conveniently grouped data are to be used the following is
appropriate:

$$\sum (p r^2),$$

where p equals the population within a segment of area (class), and
r is the distance from the mid-point of the segment to the point.
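
For grouped data the mean center can be computed directly as the population-weighted mean of the x and y coordinates of the segment mid-points. The Python sketch below illustrates this; the function names `mean_center` and `sum_squared_distances` and the use of planar coordinates are assumptions made for the example.

```python
def mean_center(points, pops):
    """Population-weighted mean of the segment mid-points.

    points: list of (x, y) mid-points; pops: population p of each segment.
    """
    P = sum(pops)
    x = sum(p * px for p, (px, _) in zip(pops, points)) / P
    y = sum(p * py for p, (_, py) in zip(pops, points)) / P
    return x, y


def sum_squared_distances(points, pops, c):
    """Value of the grouped-data expression sum(p * r^2) about the point c."""
    return sum(p * ((px - c[0]) ** 2 + (py - c[1]) ** 2)
               for p, (px, py) in zip(pops, points))
```

Evaluating `sum_squared_distances` at the returned center and at any displaced point shows the minimum property used in the definition.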


b. Median Center 


Borrowing from the median's property of dividing a population into
two equal frequencies, a number of investigators defined the median
point for an areally distributed population as the intersection of two
orthogonal axes each of which halved the population. However, if
another orientation of the axes were employed, another "median" point
would emerge if the population were not symmetrically distributed about
a central point.4 D. E. Scates5 and others have noted this limitation.
(Measures of dispersion based on this definition also suffer the same
limitations.)


However, another property of a median can be successfully used to
define a median center for a spatially distributed population. For the
conventional frequency distribution the median is that value of the
variable about which the sum of the deviations, taken without regard to
sign, is a minimum. For spatial distributions all deviations are
distances and are to be regarded as positive without loss of rigor. The
median center, then, is the point at which the sum of population times
distance is a minimum as defined by the following:

$$\int r D \, dA;$$

or for grouped data,

$$\sum (pr).$$
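
Unlike the mean center, the point minimizing the sum of population times distance has no closed-form expression in general. One common numerical approach, which is not described in the paper, is the Weiszfeld iteration: each step replaces the trial point by the average of the mid-points weighted by p/r. The Python sketch below is a simplified illustration; the function names, the starting point, and the stopping rule are assumptions, and the handling of a trial point that coincides with a mid-point is deliberately crude.

```python
from math import hypot


def total_distance(points, pops, c):
    """Value of the grouped-data expression sum(p * r) about the point c."""
    return sum(p * hypot(x - c[0], y - c[1]) for p, (x, y) in zip(pops, points))


def median_center(points, pops, iterations=500, eps=1e-12):
    """Approximate the median center by the Weiszfeld iteration."""
    P = sum(pops)
    # Start from the mean center (weighted mean of the coordinates).
    cx = sum(p * x for p, (x, _) in zip(pops, points)) / P
    cy = sum(p * y for p, (_, y) in zip(pops, points)) / P
    for _ in range(iterations):
        num_x = num_y = denom = 0.0
        for p, (x, y) in zip(pops, points):
            r = hypot(x - cx, y - cy)
            if r < eps:
                # Simplification: stop if the trial point sits on a mid-point.
                # A full treatment would test optimality at that mid-point.
                return cx, cy
            w = p / r
            num_x += w * x
            num_y += w * y
            denom += w
        cx, cy = num_x / denom, num_y / denom
    return cx, cy
```

For a population with circular symmetry the iteration stays at the mean center, in line with the remark that all the centers then coincide.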
c. Modal Center 


The third average familiar in conventional statistics is the mode.
It is a local maximum on the frequency curve. As stated before, density
of population replaces frequency for spatial distributions, and the
modal center becomes the position of the high point on the "smoothed"
density surface. From inspection of the frequency distribution, one
can often obtain an approximate value for the mode. However, the
precise value of the mode can be determined only by means of the
mathematical formula which describes the continuous curve of "closest
possible fit" for the given frequency distribution. For spatial
analysis, the problem is to determine the formula which describes the
continuous density surface of "closest possible fit" for a given areal
distribution.


4 For an areally distributed population exhibiting circular symmetry
about a central point, the foregoing and subsequently discussed average
positions would of course all coincide. When this symmetry is lacking,
the various computed centers are different and require individual
interpretation.

5 D. E. Scates, "Locating the Median of the Population in the United
States," METRON, 11 (1933), pp. 49-66. See also E. E. Sviatlovsky and
W. C. Eells, "The Centrographical Method and Regional Analysis,"
GEOGRAPHICAL REVIEW, 27 (1937), pp. 240-254.


d. Harmonic Mean Center 


For the areally distributed population, the harmonic mean center
can be conceived of as that position where the reciprocal of the sum
of the reciprocals of the distances to units of the population is at
a minimum.6 Thus one has located the harmonic mean center when the
position for the minimum value of the following expression has been
found:

$$\frac{1}{\int (1/r)\, D \, dA},$$

or for grouped data,

$$\frac{1}{\sum (p/r)}.$$





It should be noted that, unlike the mean center, a unique position 
cannot be found for the harmonic mean center using an arbitrary origin 
for x and y coordinates. 
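
Because the harmonic mean center cannot be obtained from coordinate means, it has to be located by evaluating the expression at trial positions. The Python sketch below searches a supplied list of candidate points; it is an illustration only, and the function names as well as the requirement that no candidate coincide with a segment mid-point (so that no distance is zero) are assumptions.

```python
from math import hypot


def potential(points, pops, c):
    """Grouped-data sum(p / r) about the point c (all distances must be > 0)."""
    return sum(p / hypot(x - c[0], y - c[1]) for p, (x, y) in zip(pops, points))


def harmonic_mean_distance(points, pops, c):
    """Harmonic mean distance of the population from c: P / sum(p / r)."""
    return sum(pops) / potential(points, pops, c)


def harmonic_mean_center(points, pops, candidates):
    """Candidate position with the smallest harmonic mean distance,
    which is also the candidate with the largest sum(p / r)."""
    return min(candidates,
               key=lambda c: harmonic_mean_distance(points, pops, c))
```

A finer grid of candidates gives a closer approximation; the search locates the same position as the peak of the sum of p/r.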


e. Other Centers 


It is now apparent that a number of other spatial centers can be
defined by using other functions of distance. For example, one might
define a geometric mean center to be located at the position where the
antilog of the arithmetic mean of the logarithms of distances to units
of the population is a minimum. Of course "antilog" in the above
definition is superfluous for defining this center. Its position is
where the minimum of the following expression occurs:

$$\int (\log r)\, D \, dA$$

or

$$\sum (p \log r).$$

No attempt will be made here to exhaust all other possible
definitions for centers.


6 Each unit of the population occupies finite area however small. Thus,
zero distances do not appear. For linear statistics this is not so and
the harmonic mean and arithmetic mean do not coincide for a symmetrical
frequency distribution. The computation for the harmonic mean center is
closely related to that for "potential of population." (See John Q.
Stewart and William Warntz, "Physics of Population Distribution,"
JOURNAL OF REGIONAL SCIENCE, 1 (1958), pp. 99-123.) The harmonic mean
distance of the population from a point is the reciprocal of the
potential of population there divided by the total population. The
positions of the peak of potential and the harmonic mean center
coincide.











f. Standard Distance Deviation 


In the above, the location of average positions was defined in 
terms of minima of certain quantities. These quantities can be further 
utilized to compute the measures of dispersion so necessary to supple- 
ment average positions. 


When the minimum value of $\sum (pr^2)$, which defines the mean
center, is divided by the total population, the square root of the
resulting mean square corresponds to the standard deviation of
conventional statistics and thus is a measure of spatial dispersion.
Stewart and Warntz introduced this measure as the dynamical radius of a
population.7 We may also identify it as the standard distance deviation,
a special case of the root-mean-square distance deviation which can be
computed for any point in the area. When applied to values at the mean
center the following formula defines this standard distance deviation:

$$S_r = \sqrt{\frac{\sum (p r^2)}{P}},$$

where P, of course, is the total population.
g. Mean Distance Deviation 


Dividing the minimum value of $\sum (pr)$ for the median center by
the total population gives another measure of dispersion, the mean
distance deviation, which corresponds to the mean deviation of
conventional statistics. It measures the arithmetic mean distance of
units in the population from the median center and is computed from the
following formula:

$$MD_r = \frac{\sum (pr)}{P}.$$


7 John Q. Stewart and William Warntz, "Macrogeography and Social
Science," GEOGRAPHICAL REVIEW, 48 (1958), pp. 167-184. The dynamical
radius was tabulated for every United States Census. It was used to
measure systematically the expansion of the United States population and
was based isomorphically on the relations between kinetic and dynamic
energy.


h. Harmonic Mean Distance Deviation


The harmonic mean distance deviation of a spatially distributed
population may be found by dividing the quantity $\sum (p/r)$ by the
total population and taking the reciprocal of the result. This yields
the harmonic mean of the distances of the units of the population from
its harmonic mean center. The formula is:

$$H_r = \frac{1}{\dfrac{\sum (p/r)}{P}},$$

or simply,

$$H_r = \frac{P}{\sum (p/r)}.$$
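
The three distance deviations defined in sections f, g and h share the same ingredients and can be computed together for grouped data. The Python sketch below evaluates them about any chosen point; it is an illustration, the function name `distance_deviations` is an assumption, and each measure takes its defining minimum value only when the point supplied is the corresponding center.

```python
from math import hypot, sqrt


def distance_deviations(points, pops, center):
    """Root-mean-square, mean, and harmonic mean distance of a grouped
    population from `center` (every distance must be greater than zero
    for the harmonic measure)."""
    P = sum(pops)
    r = [hypot(x - center[0], y - center[1]) for x, y in points]
    standard = sqrt(sum(p * d * d for p, d in zip(pops, r)) / P)
    mean = sum(p * d for p, d in zip(pops, r)) / P
    harmonic = P / sum(p / d for p, d in zip(pops, r))
    return standard, mean, harmonic
```

About any one point the harmonic mean distance never exceeds the mean distance, which in turn never exceeds the root-mean-square distance.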


i. Generalized Measures of Dispersion 


Each of the measures of dispersion indicated in the foregoing
sections can be computed for any point in the area. Each measure is at a
minimum about its appropriate center, which is defined by that minimum.
It is for such measures that the label "measure of dispersion" has
generally been reserved. Nevertheless, these measures computed about
other points are indicative of the dispersion of the population about
those points. However, more inclusive measures, based on the
population's intrinsic spread and independent of any particular
reference point, can be computed.


For example, one can compute the value of the mean distance
between the members of all possible pairs in the population. This is
equivalent to a mean distance deviation averaged over the whole
distribution and can be called the general mean distance deviation. Its
formula is:

$$\text{General } MD_r = \frac{\sum \left[ p \sum (pr) \right]}{P^2}.$$

The harmonic mean distance deviation similarly can be generalized
to the general harmonic mean distance deviation as follows:8

$$\text{General } H_r = \frac{P^2}{\sum \left[ p \sum (p/r) \right]}.$$


8 The identical measure was introduced by John Q. Stewart in
"Demographic Gravitation: Evidence and Applications," SOCIOMETRY, 11
(1948), pp. 31-58, in connection with the "demographic energy" of the
United States population.





However, for the general standard distance deviation, such laborious
procedures are not necessary. Yule and Kendall have demonstrated
that the root-mean-square deviation averaged over a frequency distri-
bution is equal to its standard deviation multiplied by the square root
of two.⁹ One can readily generalize their proof to include areal dis-
tributions and show that the general standard distance deviation is
equal to the standard distance deviation times the square root of two.
Therefore the general standard distance deviation becomes:

$$ \text{general } s_r = \sqrt{2}\; s_r $$
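
The relationship just stated can be checked numerically. The sketch below is only an illustration under stated assumptions: the population is given as hypothetical (x, y, p) tuples, and the general measure is averaged over all ordered pairs of units, including each unit paired with itself, so that the divisor is P squared as in the general formulas above.

```python
import math

def mean_center(units):
    """Population-weighted mean of x and y for (x, y, p) tuples."""
    total = sum(p for _, _, p in units)
    cx = sum(x * p for x, _, p in units) / total
    cy = sum(y * p for _, y, p in units) / total
    return cx, cy

def standard_distance(units):
    """Root-mean-square distance of the units from their mean center."""
    cx, cy = mean_center(units)
    total = sum(p for _, _, p in units)
    squares = sum(p * ((x - cx) ** 2 + (y - cy) ** 2) for x, y, p in units)
    return math.sqrt(squares / total)

def general_standard_distance(units):
    """Root-mean-square distance over all ordered pairs of units (divisor P^2)."""
    total = sum(p for _, _, p in units)
    acc = 0.0
    for xi, yi, pi in units:
        for xj, yj, pj in units:
            acc += pi * pj * ((xi - xj) ** 2 + (yi - yj) ** 2)
    return math.sqrt(acc / total ** 2)

# Hypothetical grouped data, chosen only for illustration.
units = [(0.0, 0.0, 1.0), (4.0, 0.0, 1.0), (0.0, 3.0, 2.0)]
s_r = standard_distance(units)
general_s_r = general_standard_distance(units)
print(math.isclose(general_s_r, math.sqrt(2.0) * s_r))  # True
```

The pairwise computation grows with the square of the number of groups, which is why the shortcut through the standard distance deviation is convenient.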


Recently a number of investigators have utilized a measure of dis-
persion based on mean distance between nearest neighbors.¹⁰ Although
relationships have been carefully worked out for distances and proba- 
bilities, such a measure has two disadvantages when compared to the 
general measures above. It cannot be computed when only grouped data 
are available and, in addition, each unit of the population is not 
taken in conjunction with every other unit. 


j. Skewness 


For a frequency distribution, the Pearsonian measure of skewness
relating to the asymmetry of a linear distribution is commonly employed.
For the spatial distribution the counterpart is found by dividing the 
distance between the mean center and the modal center by the standard 
distance deviation. The following formula applies: 


$$ \text{Skewness} = \frac{r_{mo\,m}}{s_r} $$

where $r_{mo\,m}$ is the distance between the mean center and the modal
center. Of course no algebraic sign is attached to this measure, but one
may speak of the direction of skewness, its bearing being that of the
line connecting the modal center and the mean center.


⁹ G. U. Yule and M. Kendall, AN INTRODUCTION TO THE THEORY OF STATISTICS,
New York (1950), p. 147.

¹⁰ This measure was introduced by Philip J. Clark and Francis C. Evans
in “Distance to Nearest Neighbor as a Measure of Spatial Relationships
in Populations,” ECOLOGY, 35 (1954), pp. 445-453, using illustrations
drawn from the areal distributions of botanical species.


k. Inequalities


A useful rule concerning the proportion of a population falling
outside stated limits for any type of frequency distribution can be
applied to spatial distributions as well. The Tchebycheff Inequality
states that the fraction of the total population lying beyond the range
established by the arithmetic mean plus and minus a given number of
standard deviations will be equal to or less than the reciprocal of
the square of this number of standard deviations. For spatial dis-
tributions this means that the fraction of the population lying outside
a circle centered on the mean center and with a radius equal to a given
number of standard distance deviations will be equal to or less than
the reciprocal of the square of the number of standard distance devi-
ations. The following formula applies:


$$ F \le \frac{1}{k^2} $$

where F is the fraction of the total population lying outside the circle
and k is the number of standard distance deviations.


Obviously this is a very crude measure and the estimate can be 
much improved with more exact knowledge about the precise form of the 
distribution. In fact, considerable improvement can be made if it is 
known that the distribution may be regarded as unimodal and moderately 
skewed. In this case one can utilize the Camp-Meidell Inequality, which
introduces the factor 2.25 into the denominator, so that
$F \le 1/(2.25\,k^2)$.


The value of the Tchebycheff theorem lies in the fact that it is 
completely general and may be used for any spatial distribution, pro- 
vided only that the standard distance deviation can be computed. 
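
The two bounds discussed in this section can be compared in a short sketch. The function names are illustrative, and the caps at 1.0 are an added convenience (a fraction cannot exceed unity); the Camp-Meidell form is the one described in the text, with the factor 2.25 in the denominator, and it presupposes a unimodal, moderately skewed distribution.

```python
def tchebycheff_bound(k):
    """Upper bound on the fraction lying outside k standard distance deviations."""
    if k <= 0:
        raise ValueError("k must be positive")
    return min(1.0, 1.0 / k ** 2)

def camp_meidell_bound(k):
    """Tighter bound, assuming a unimodal and moderately skewed distribution."""
    if k <= 0:
        raise ValueError("k must be positive")
    return min(1.0, 1.0 / (2.25 * k ** 2))

for k in (1.0, 2.0, 3.0):
    print(k, round(tchebycheff_bound(k), 4), round(camp_meidell_bound(k), 4))
```

At k = 2, for example, the Tchebycheff bound allows up to 25 per cent of the population outside the circle, while the Camp-Meidell bound allows about 11 per cent.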


3. MOMENTS 


The various measures of dispersion about average positions dis- 
cussed earlier are related from a formal mathematical point of view. 
They all involve p, P, and r. They differ only in the exponent of r 
employed. Their formulas are analogous to those of a series of sta- 
tistical moments. In deriving the mean distance deviation the first 
moment was employed; for the standard distance deviation the second
moment was used. The harmonic mean distance deviation was computed
from what might be called an “inverse first moment.” Obviously, third,
fourth, and higher moments can readily be computed. 


The various moments can be computed about any point, not just about 
the average positions which their minima define. Let the following 
formula apply to the computation of any moment about any point: 

$$ m'_n = \frac{\sum (p\,r^n)}{P} $$

where n = the exponent of distance and thus the number of the moment.
We shall use $m_n$ to designate the nth moment about the particular
center defined by the minimum value of that moment.


For the symmetrical areal distribution each moment has its mini- 
mum value at the one central point. But, since deviations for spatial 
distributions are all positive distances, certain of the properties of 
moments of a frequency distribution are not retained by their spatial
counterparts. For example, the spatial first moment (indeed all odd
moments) can never have a zero or negative value. (It will be recalled
that the minimum value of this first moment defines the median center.)


The second moment retains its features and $m'_2$ can be shown to be
functionally related to $m_2$ thanks to Pythagoras. ($m_2 = m'_2 - r^2$,
where r is the distance between the two points.) The value of $m_2$ may
also be called the distance variance of the population as it is the
square of the standard distance deviation.


The ratio of $m_4$ to $(m_2)^2$, when each is computed about the mean
center, may be regarded as an indication of the peakedness (kurtosis)
of the distribution.
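
The moment computations described in this section can be sketched as follows. The helper name `moment`, the (x, y, p) representation and the sample data are assumptions for illustration; the moment is taken as the population-weighted sum of the nth power of distance divided by the total population. The sketch checks the Pythagorean relation for the second moment and forms the ratio of the fourth moment to the squared second moment.

```python
import math

def moment(units, point, n):
    """n-th distance moment of (x, y, p) units about `point`: sum(p * r**n) / P."""
    px, py = point
    total = sum(p for _, _, p in units)
    return sum(p * math.hypot(x - px, y - py) ** n for x, y, p in units) / total

# Hypothetical grouped data; its mean center is (1.0, 1.5).
units = [(0.0, 0.0, 1.0), (4.0, 0.0, 1.0), (0.0, 3.0, 2.0)]
center = (1.0, 1.5)
other_point = (0.0, 0.0)

m2 = moment(units, center, 2)             # distance variance
m2_other = moment(units, other_point, 2)  # second moment about another point
shift_squared = (center[0] - other_point[0]) ** 2 + (center[1] - other_point[1]) ** 2

# Second moment about any point = second moment about the mean center
# plus the squared distance between the two points.
print(math.isclose(m2_other, m2 + shift_squared))  # True

kurtosis_ratio = moment(units, center, 4) / m2 ** 2
print(round(kurtosis_ratio, 3))
```

Calling `moment` with n = 1 gives the mean distance deviation about the chosen point, and n = -1 gives the "inverse first moment" from which the harmonic mean distance deviation is obtained.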


Since distance is infinitely divisible and since moments can be 
computed about every point in the area, these moments then represent 
spatially continuous values despite the fact that they are derived from 
a population comprised of a finite number of discrete individuals. 
Thus, from a microgeographic distribution of a population, macrogeo- 
graphic variables can be created. 


When values of a moment have been computed for enough points in 
an area, a map of the area can be drawn showing lines connecting points 
of equal value. These contour lines will appear as concentric circles 
for all moments for a symmetrically distributed population. Only the 
second moment, however, maps as concentric circles regardless of the 
microgeographic distribution. For all other moments the contours de- 
part from concentricity to meet the requirements of the particular 
distribution. 


4. DENSITY SURFACES 


Space need not be taken here to discuss extensively the role of 
the normal curve of error in conventional statistics. It will suffice 
to call attention to its importance as a model of a theoretical distri- 
bution with which actual distributions can be compared. This curve, of 
course, represents only one of a number of mathematical families to 
which frequency distributions arising in practice can be assigned. The 
assignment of such a curve, however, is not a matter to be entered into 
lightly and one must understand fully the advantages and limitations of 
allowing one particular set of compact values to describe a distribution.


It is possible in similar fashion to develop a family of density 
surfaces to describe the distributions of populations over area. The 
normal correlation surface or the bivariate normal distribution, while 
not dealing specifically with the areal distribution of one population, 
can be applied to such a distribution when two orthogonal x and y axes 
are employed. The form of this model is wholly dependent on the selec- 
tion of arbitrary axes. As stated before, we wish to direct attention 
to models in which axes need not be selected and in which deviations 
are considered as radial distances.


As an example we present what we shall call The Probability Density
Surface. This model has the feature that all cross sections through the
mean center exhibit the same profile. Thus any section parallel to one
of the above cross sections may be derived from that cross section by
reducing all the density values of that central section by a constant
ratio.


We define density at any distance out from the mean center (regard-
less of direction), $D_r$, as equal to the central density, $D_0$, times
$e^{-r^2/s_r^2}$, where r and $s_r$ are defined as previously. Density at a
point is ambiguous, but is a useful mathematical abstraction. It is intended that
this density function measure the number of units of the population in 
a unit of area centered on a point when this unit area is a square with 
side equal to one unit of distance. It is also required that the popu- 
lation be large enough and that the units of area be small in relation 
to the total area so that the density distribution may be regarded as a 
continuous surface. 
For this surface, $D_0 = \dfrac{P}{\pi s_r^2}$; thus,

$$ D_r = \frac{P}{\pi s_r^2}\, e^{-r^2/s_r^2} $$

is the equation for the Probability Density Surface.
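
As a check on the central density, the following sketch sums the population of thin concentric rings under the surface and compares the total with P. It is a numerical illustration only: the function names, the midpoint summation, and the cutoff at six standard distance deviations are choices made for this example, and the values of P and of the standard distance deviation are borrowed from the 1950 figures reported later in the paper purely as sample inputs.

```python
import math

def density(r, total_population, s_r):
    """Density at distance r: D_0 * exp(-r^2 / s_r^2), with D_0 = P / (pi * s_r^2)."""
    central_density = total_population / (math.pi * s_r ** 2)
    return central_density * math.exp(-(r / s_r) ** 2)

def volume_under_surface(total_population, s_r, r_max, steps=20_000):
    """Sum density times ring area over thin rings from 0 to r_max (midpoint rule)."""
    dr = r_max / steps
    volume = 0.0
    for i in range(steps):
        r_mid = (i + 0.5) * dr
        ring_area = 2.0 * math.pi * r_mid * dr
        volume += density(r_mid, total_population, s_r) * ring_area
    return volume

P = 150_697_361   # persons (sample input)
s_r = 787.0       # miles (sample input)
total = volume_under_surface(P, s_r, r_max=6.0 * s_r)
print(abs(total - P) / P < 1e-6)  # True: the volume reproduces the population
```

The agreement confirms that the central density P/(pi s_r^2) is the one that equates the volume under the surface with the total population.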


As early as 1892, the prodigious William Woolsey Johnson called 
attention to a surface of probability in discussing the deviations of 
the bullet marks in target practice from the target aimed at, as acci- 
dental in nature. 


The probability surface is a surface of revolution analogous to
the curve of probability in the case of linear errors. 


Although this is a probability model, one may also consider it as 
representing an areally distributed population, by regarding the total 
population as equal to certainty and by equating the total volume under
the surface with the total population. 


Given here for the first time (Table I) is a table of densities for 
the Probability Density Surface. This table is entered with the value 
of the distance from the mean center divided by the standard distance 
deviation. The density at any given distance out is stated as a decimal
fraction of the maximum or central density. 


Another table presented for the first time (Table II) for such a 
spatially distributed population is given below as a table of volumes 
under the probability density surface, where, as indicated earlier, the 
total population is equated with the volume. 





¹¹ W. W. Johnson, THE THEORY OF ERRORS AND METHODS OF LEAST SQUARES,
New York (1892), 172 pp. Detailed proofs are found here.


An interesting feature of Table II is the fact that at any distance
$r/s_r$, the value on this table representing the decimal fraction of the
total population included within a circle of such a radius is the com-
plement of the value for the density to be found at that same distance
out. That is to say, one minus the value in the density table yields
the value in the volume table.


To find the decimal fraction of the total population located within 
a circle centered on the mean center and with a radius of a given dis- 
tance, one enters the table with the ratio of this distance to the 
standard distance deviation. It will be observed that 0.63212 of the 
population is found within a distance equal to one standard distance 
deviation; 0.98168 within two; and 0.99988 within three. The circle 
within which half the population is located has a radius equal to 
$0.833\,s_r$. In terms of probabilities this represents the circle within
which a unit of the population is as likely as not to be found, and for
which Johnson has suggested the term probable circle. Its radius is
the probable error of the distribution and, as such, is a measure of
dispersion.
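
The figures quoted above follow from the complement relation between the two tables: the fraction of the population within a radius of k standard distance deviations is one minus exp(-k^2). The sketch below reproduces them; the function names are illustrative, and the closed form for the probable circle (the square root of the natural logarithm of 2) is obtained by setting that fraction equal to one half.

```python
import math

def fraction_within(k):
    """Fraction of the population within k standard distance deviations."""
    return 1.0 - math.exp(-k ** 2)

def probable_circle_radius(s_r):
    """Radius enclosing half the population: solve 1 - exp(-(r/s_r)^2) = 1/2."""
    return s_r * math.sqrt(math.log(2.0))

for k in (1, 2, 3):
    print(k, round(fraction_within(k), 5))   # 0.63212, 0.98168, 0.99988

print(round(probable_circle_radius(1.0), 3))  # 0.833
```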


Of course the maximum density is $D_0$, but the ring of a given ele-
mentary width dr (the elementary annulus) which contains the greatest
number of units of the population is located at a distance equal to
$s_r/\sqrt{2}$ from the mean center. In probability terms, this can be
considered the most probable distance and is still another measure of
dispersion.


For this model of geographical distribution the value of $m_4$ divided
by $(m_2)^2$, i.e., the kurtosis, is 2.
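
Both statements, the most probable distance and the kurtosis of 2, can be verified numerically from the share of the population lying in each thin ring. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the ring share is taken as (2r/s_r^2) exp(-r^2/s_r^2) dr (density times ring area, divided by P), and the integrals are approximated by a midpoint sum truncated at eight standard distance deviations.

```python
import math

def ring_share(r, s_r, dr):
    """Fraction of the population in the thin ring between r and r + dr."""
    return (2.0 * r / s_r ** 2) * math.exp(-(r / s_r) ** 2) * dr

def radial_moment(n, s_r, steps=100_000, r_max_in_s=8.0):
    """n-th moment of distance from the mean center, by midpoint summation."""
    dr = r_max_in_s * s_r / steps
    total = 0.0
    for i in range(steps):
        r_mid = (i + 0.5) * dr
        total += r_mid ** n * ring_share(r_mid, s_r, dr)
    return total

def most_probable_distance(s_r, steps=10_000, r_max_in_s=3.0):
    """Grid search for the radius whose ring holds the largest population."""
    dr = r_max_in_s * s_r / steps
    best_index = max(range(steps + 1), key=lambda i: ring_share(i * dr, s_r, dr))
    return best_index * dr

s_r = 1.0
m2 = radial_moment(2, s_r)
m4 = radial_moment(4, s_r)
print(round(m2, 4), round(m4 / m2 ** 2, 4))     # 1.0 2.0
print(round(most_probable_distance(s_r), 3))    # 0.707
```

The second moment returns the square of the assumed standard distance deviation, which is the internal consistency the text goes on to emphasize.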


The model discussed above is only one of a possible family of sur- 
faces with an exponential drop off of density. It is often assumed for 
areal distributions (if orthogonal axes again be introduced) that devi-
ations in x and deviations in y are independent of each other. But, the
hypothesis that each of these deviations is independent, with each fol-
lowing the usual normal curve of error, leads to the conclusion that the
resulting distribution depends upon r, and not upon direction. While
there may be no a priori reason why x and y are to be considered as
independent, it is important to note that no other model would produce
such an internally consistent set of measures based only upon the dis-
tance from the center. This is easily seen, for example, from a con-
sideration of $m_2$. Any other density surface involving another expo-
nential drop off would require that the units of a given total population
be moved either all inward toward or all outward from the mean center.
Such shifting of course changes the value of $m_2$ from which $s_r$ is com-
puted. Internally consistent tables of density and volume can not be
constructed, as an assumed $s_r$ for one table leads to a different (but
relatable) computed $s_r$ from the other table.


TABLE 1
Densities of the Probability Density Surface

At Distances $r/s_r$ from the Mean Center, Expressed as Decimal Fractions of the Maximum
Central Density $D_0$

The central density (i.e. maximum density) is computed from the expression $D_0 = P/(\pi s_r^2)$.
The values tabled below result from solving the expression $e^{-r^2/s_r^2}$.

r/s_r     .00      .01      .02      .03      .04      .05      .06      .07      .08      .09

0.0   1.00000   .99990   .99960   .99910   .99840   .99750   .99640   .99511   .99362   .99193
0.1    .99005   .98797   .98570   .98324   .98059   .97775   .97472   .97151   .96812   .96454
0.2    .96079   .95686   .95275   .94848   .94403   .93941   .93463   .92969   .92459   .91934
0.3    .91393   .90837   .90267   .89682   .89083   .88471   .87845   .87206   .86554   .85890
0.4    .85214   .84527   .83828   .83119   .82399   .81669   .80929   .80180   .79422   .78655
0.5    .77880   .77097   .76307   .75510   .74707   .73897   .73081   .72260   .71434   .70603
0.6    .69768   .68929   .68086   .67240   .66392   .65541   .64686   .63833   .62977   .62120
0.7    .61263   .60405   .59547   .58690   .57834   .56978   .56124   .55272   .54422   .53574
0.8    .52729   .51886   .51048   .50211   .49381   .48554   .47730   .46912   .46098   .45289
0.9    .44486   .43688   .42896   .42109   .41329   .40555   .39788   .39028   .38274   .37527
1.0    .36788   .36056   .35331   .34614   .33905   .33204   .32511   .31825   .31149   .30480
1.1    .29820   .29168   .28525   .27890   .27264   .26647   .26038   .25439   .24848   .24266
1.2    .23693   .23129   .22573   .22027   .21490   .20961   .20446   .19931   .19429   .18936
1.3    .18452   .17977   .17510   .17052   .16603   .16162   .15730   .15306   .14891   .14484
1.4    .14086   .13696   .13313   .12939   .12573   .12215   .11865   .11522   .11187   .10860
1.5    .10540   .10227   .09922   .09624   .09333   .09049   .08772   .08502   .08238   .07981
1.6    .07730   .07486   .07248   .07017   .06791   .06571   .06357   .06149   .05946   .05749
1.7    .05558   .05371   .05190   .05014   .04843   .04677   .04516   .04359   .04207   .04060
1.8    .03916   .03778   .03643   .03512   .03386   .03263   .03144   .03029   .02918   .02810
1.9    .02705   .02604   .02506   .02412   .02320   .02231   .02146   .02063   .01983   .01906
2.0    .01832   .01760   .01690   .01623   .01558   .01496   .01436   .01378   .01322   .01268
2.1    .01216   .01165   .01117   .01071   .01026   .00983   .00941   .00901   .00863   .00826
2.2    .00791   .00757   .00724   .00692   .00662   .00633   .00605   .00578   .00553   .00528
2.3    .00504   .00481   .00460   .00439   .00419   .00400   .00381   .00364   .00347   .00331
2.4    .00315   .00300   .00286   .00273   .00260   .00247   .00235   .00224   .00213   .00203
2.5    .00193   .00184   .00175   .00166   .00158   .00150   .00143   .00135   .00129   .00122
2.6    .00116   .00110   .00104   .00099   .00094   .00089   .00085   .00080   .00076   .00072
2.7    .00068   .00065   .00061   .00058   .00055   .00052   .00049   .00047   .00044   .00042
2.8    .00039   .00037   .00035   .00033   .00031   .00030   .00028   .00026   .00025   .00024
2.9    .00022   .00021   .00020   .00019   .00018   .00017   .00016   .00015   .00014   .00013
3.0    .00012   .00012   .00011   .00010   .00010   .00009   .00009   .00008   .00008   .00007
3.1    .00007   .00006   .00006   .00006   .00005   .00005   .00005   .00004   .00004   .00004
3.2    .00004   .00003   .00003   .00003   .00003   .00003   .00002   .00002   .00002   .00002
3.3    .00002   .00002   .00002   .00002   .00001   .00001   .00001   .00001   .00001   .00001
3.4    .00001   .00001   .00001   .00001   .00001   .00001   .00001   .00001   .00001   .00001
3.5    .00000 *

* This value approaches but never actually reaches zero. It is shown as such here
only because the values on this table are rounded to five decimal places.


TABLE 2
Volumes under the Probability Density Surface

Population Included Within a Circle Having Radius $r/s_r$ with the Mean Center
as Origin, Expressed as a Decimal Fraction of the Total Population.

r/s_r     .00      .01      .02      .03      .04      .05      .06      .07      .08      .09

0.0    .00000   .00010   .00040   .00090   .00160   .00250   .00360   .00489   .00638   .00807
0.1    .00995   .01203   .01430   .01676   .01941   .02225   .02528   .02849   .03188   .03546
0.2    .03921   .04314   .04725   .05152   .05597   .06059   .06537   .07031   .07541   .08066
0.3    .08607   .09163   .09733   .10318   .10917   .11529   .12155   .12794   .13446   .14110
0.4    .14786   .15473   .16172   .16881   .17601   .18331   .19071   .19820   .20578   .21345
0.5    .22120   .22903   .23693   .24490   .25293   .26103   .26919   .27740   .28566   .29397
0.6    .30232   .31071   .31914   .32760   .33608   .34459   .35314   .36167   .37023   .37880
0.7    .38737   .39595   .40453   .41310   .42166   .43022   .43876   .44728   .45578   .46426
0.8    .47271   .48114   .48952   .49789   .50619   .51446   .52270   .53088   .53902   .54711
0.9    .55514   .56312   .57104   .57891   .58671   .59445   .60212   .60972   .61726   .62473
1.0    .63212   .63944   .64669   .65386   .66095   .66796   .67489   .68175   .68851   .69520
1.1    .70180   .70832   .71475   .72110   .72736   .73353   .73962   .74561   .75152   .75734
1.2    .76307   .76871   .77427   .77973   .78510   .79039   .79554   .80069   .80571   .81064
1.3    .81548   .82023   .82490   .82948   .83397   .83838   .84270   .84694   .85109   .85516
1.4    .85914   .86304   .86687   .87061   .87427   .87785   .88135   .88478   .88813   .89140
1.5    .89460   .89773   .90078   .90376   .90667   .90951   .91228   .91498   .91762   .92019
1.6    .92270   .92514   .92752   .92983   .93209   .93429   .93643   .93851   .94054   .94251
1.7    .94442   .94629   .94810   .94986   .95157   .95323   .95484   .95641   .95793   .95940
1.8    .96084   .96222   .96357   .96488   .96614   .96737   .96856   .96971   .97082   .97190
1.9    .97295   .97396   .97494   .97588   .97680   .97769   .97854   .97937   .98017   .98094
2.0    .98168   .98240   .98310   .98377   .98442   .98504   .98564   .98622   .98678   .98732
2.1    .98784   .98835   .98883   .98929   .98974   .99017   .99059   .99099   .99137   .99174
2.2    .99209   .99243   .99276   .99308   .99338   .99367   .99395   .99422   .99447   .99472
2.3    .99496   .99519   .99540   .99561   .99581   .99600   .99619   .99636   .99653   .99669
2.4    .99685   .99700   .99714   .99727   .99740   .99753   .99765   .99776   .99787   .99797
2.5    .99807   .99816   .99825   .99834   .99842   .99850   .99857   .99865   .99871   .99878
2.6    .99884   .99890   .99896   .99901   .99906   .99911   .99915   .99920   .99924   .99928
2.7    .99932   .99935   .99939   .99942   .99945   .99948   .99951   .99953   .99956   .99958
2.8    .99961   .99963   .99965   .99967   .99969   .99970   .99972   .99974   .99975   .99976
2.9    .99978   .99979   .99980   .99981   .99982   .99983   .99984   .99985   .99986   .99987
3.0    .99988   .99988   .99989   .99990   .99990   .99991   .99991   .99992   .99992   .99993
3.1    .99993   .99994   .99994   .99994   .99995   .99995   .99995   .99996   .99996   .99996
3.2    .99996   .99997   .99997   .99997   .99997   .99997   .99998   .99998   .99998   .99998
3.3    .99998   .99998   .99998   .99998   .99999   .99999   .99999   .99999   .99999   .99999
3.4    .99999   .99999   .99999   .99999   .99999   .99999   .99999   .99999   .99999   .99999
3.5   1.00000 *

* This value approaches but never actually equals one. It is shown here as such
only because the values on this table are rounded to five decimal places.


By way of example for the above statement let us consider another
distribution, an imposter (though possibly useful) calling itself “The
Normal Density Surface.” Here the distribution is such that its density
surface is the surface of revolution of the conventional linear normal 
curve. Any cross section passed through the mean center would yield a 
profile like that of the normal curve. This appearance is an illusion, 
however. But to continue the illustration, let us assume that density 
at any distance out from the center can be defined as: 


$$ D_r = D_0\, e^{-r^2/2s_r^2} $$


and the central density is taken as: 


The table of densities for this surface would be precisely the
table of ordinates for the normal curve entered with the value of $r/s_r$.

The tables of volumes under the surface would not, however, contain the
same values as the corresponding table of areas under the normal curve.
The reason is that successive equal segments of length along a line will
sweep out rings of successively larger areas when that line is rotated
about its center point. The ratio of the areas follows an arithmetic
progression. Consequently a new table of volumes has to be computed.
This table, too, is the complementary companion of its appropriate den-
sity table. The inconsistency of these tables is revealed, however,
when one assumes values for P and $s_r$ and actually distributes a popula-
tion according to these tables. The actual root-mean-square distance
deviation computed about the mean center for this distribution is equal
to $\sqrt{2}$ times the originally assumed $s_r$. In fact a given density
from the Probability Density Surface table discussed earlier would
appear in this “Normal Density Surface” table at $\sqrt{2}$ times the value
$r/s_r$ on the Probability Density Surface table.¹²
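
The inconsistency described above can be illustrated numerically. The sketch below distributes a population over thin rings in proportion to ring area times the density exp(-r^2 / 2 s_r^2) and then computes the root-mean-square distance actually obtained. The function name, the midpoint summation and the cutoff at twelve assumed deviations are choices made for this example; the central density cancels in the ratio and is therefore not needed.

```python
import math

def rms_distance_normal_surface(s_r, steps=100_000, r_max_in_s=12.0):
    """Root-mean-square distance for a density proportional to exp(-r^2 / (2 s_r^2))."""
    dr = r_max_in_s * s_r / steps
    weight_sum = 0.0
    squared_sum = 0.0
    for i in range(steps):
        r_mid = (i + 0.5) * dr
        # Ring population up to a constant factor: ring area times density.
        weight = r_mid * math.exp(-r_mid ** 2 / (2.0 * s_r ** 2)) * dr
        weight_sum += weight
        squared_sum += r_mid ** 2 * weight
    return math.sqrt(squared_sum / weight_sum)

assumed_s_r = 1.0
computed = rms_distance_normal_surface(assumed_s_r)
print(round(computed / assumed_s_r, 4))  # 1.4142, i.e. the square root of two
```

The computed deviation exceeds the assumed one by the factor the text names, which is the sense in which the two tables for this surface cannot both be entered with the same standard distance deviation.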





¹² In “Comparison of Estimates of Circular Probable Error,” JOURNAL OF
THE AMERICAN STATISTICAL ASSOCIATION, Vol. 54, No. 288, Dec. 1959, pp.
794-800, P. B. Moranda has based his estimates of the square root of the
common variance (in x and y distances) of deviations of the impact point
from the target center for weapons tests upon the function which defines
the misleading density surface which we have called “The Normal Density
Surface” and which he claims is the density function for radial errors.
It is to be noted that if one assumes Moranda’s value of 1.1774σ as The
Circular Probable Error (what we refer to as The Probable Circle) he
cannot actually construct such a density surface using Moranda’s func-
tion and have only 50% of the population included within the Circular
Probable Error so defined.


5. SOME PARAMETERS OF THE GEOGRAPHICAL DISTRIBUTION OF
POPULATION IN THE UNITED STATES, 1950


The Bureau of the Census reported that there were 150,697,361 per-
sons distributed over the United States as of April 1, 1950. (See
figure 1). The mean center of this distribution was reported to be
located in Richland County, Illinois. The computation of this par-
ticular average position continued the Bureau’s tradition of finding
the “balancing point” for the population following each census.¹³


In figure 2 is given a map recently prepared at the American 
Geographical Society, showing the contours of the second moment of 
this population. The minimum of this moment occurs at the mean center, 
and its contours necessarily map as concentric circles about that cen- 
ter regardless of the distribution. 


The map shown in figure 3 exhibits contours related to the first 
moment of population. The minimum of the values shown here defines the 
median center which is seen to be located near Indianapolis. Such 
contours do not map as concentric circles unless the population distri- 
bution has circular symmetry. But, like the second moment, any distri- 
bution can have but one minimum point and similarly, along any one 
constant bearing outward from this center successively larger values 
are encountered. The value for any point can be interpolated from
neighboring contours and can be interpreted as indicating the aggregate
travel distances required for all individuals to be moved to this point
by shortest airline distance. A related map showing average (arithmetic
mean) travel distance could be called a map of the first moment of pop- 
ulation. The map given here could readily be converted to such a map 
by dividing all values by a constant equal to the total United States 
population. 
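
The construction of such a map can be sketched on a small scale. In the example below, which uses hypothetical data and illustrative function names, the aggregate travel distance is evaluated at every node of a coarse grid, the node with the smallest value is taken as an approximation to the median center, and the aggregate is converted to an average travel distance by dividing by the total population. A real map would use a much finer grid and great-circle rather than plane distances.

```python
import math

def aggregate_travel_distance(units, point):
    """Sum of p * r over all (x, y, p) units: total travel needed to reach `point`."""
    px, py = point
    return sum(p * math.hypot(x - px, y - py) for x, y, p in units)

def grid_minimum(units, x_values, y_values):
    """Return (best_point, aggregate) for the grid node with the smallest aggregate."""
    best_point, best_value = None, float("inf")
    for gx in x_values:
        for gy in y_values:
            value = aggregate_travel_distance(units, (gx, gy))
            if value < best_value:
                best_point, best_value = (gx, gy), value
    return best_point, best_value

# Hypothetical population: three units strung along a line.
units = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (5.0, 0.0, 1.0)]
x_values = [i * 0.5 for i in range(11)]       # 0.0, 0.5, ..., 5.0
y_values = [j * 0.5 - 1.0 for j in range(5)]  # -1.0, -0.5, ..., 1.0

median_center, aggregate = grid_minimum(units, x_values, y_values)
total_population = sum(p for _, _, p in units)
print(median_center, aggregate, round(aggregate / total_population, 3))
```

For this sample the minimum falls at the middle unit, (1.0, 0.0), with an aggregate of 5.0, whereas the mean center would lie at x = 2.0; the two centers answer different questions.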


The maps of the first and second moments, however interesting from 
the purely statistical point of view, have had little significance or 
demonstrated usefulness in problems in social science in which areal 
distribution is important. Contrasted with this sterility are the 
manifold applications of the contours of potential of population as 
displayed in figure 1. Considerable evidence has been amassed to 
demonstrate the significance of this set of contours as indicators of 
the spatial structuring of social and economic phenomena. Details
cannot be discussed here,¹⁴ but one feature requires a brief mention:
the high correlation between population density and population potential.
The peak of potential of population coincides with the modal center on
the smoothed density surface for the United States. Nothing in the
mathematics of potential of population or the harmonic mean, to which
it is related, requires that the modal center and the major peak of
potential coincide. Viewed only statistically, the agreement is coin-
cidental. The explanation lies beyond the realm of mathematical sta-
tistics and is to be found in a natural science of society. Figure 2
shows the location of various centers of the United States population
in 1950 and also circles centered on the mean center with radii equal
to $s_r$ and $2s_r$.


¹³ It is assumed that the Bureau of the Census will perform this compu-
tation in conjunction with the 1960 census. Undoubtedly Alaskan and
Hawaiian populations will be included, but it might also be advisable
for the Bureau to compute the center disregarding these populations for
obvious reasons.


Below are given the values of the various measures of dispersion
computed for the 1950 population distribution:

$s_r$ = 787 miles;
$MD_r$ = 618 miles; and
$H_r$ = 158 miles.

The values of the generalized measures of dispersion independent
of any particular origin are:

general $s_r$ = 1113 miles;
general $MD_r$ = 924 miles; and
general $H_r$ = 420 miles.


The skewness of the population was computed to be 0.959 when the
modal divergence, found to be 755 miles, was divided by the $s_r$ given
above. The value for kurtosis was computed as 1.92. (Compare this
with the value 2 for the probability density surface.)


Approximately 69% of the total United States population in 1950
lived within 1 $s_r$ of the mean center; 90% were found within 2 $s_r$ (see
figure 2). When compared with similar values for the probability den-
sity surface, the greatest difference occurs with regard to the number
of people located at distances of from 1 to 2 $s_r$'s from the mean center.
The probability density surface requires that approximately 35% of
a population be found here as contrasted to the 21% so located.
This gives a quantitative indication of the deficit of population in
the Rocky Mountains and Great Plains. Although this deficit is seem-
ingly related to the physical environment, a more meaningful approxi-
mation results from studying the relationship between potential of
population and population density.


¹⁴ In addition to the works cited in some of the above footnotes, inter-
ested students are referred to the following by John Q. Stewart: “Empir-
ical Mathematical Rules Concerning the Distribution and Equilibrium of
Population,” GEOGRAPHICAL REVIEW, 37 (1947), pp. 461-485; “The Develop-
ment of Social Physics,” AMERICAN JOURNAL OF PHYSICS, 18 (1950), pp.
239-253; “Potential of Population and its Relationship to Marketing,”
in THEORY IN MARKETING, ed. by R. Cox and W. Alderson, Chicago (1950),
pp. 19-40; and “A Basis for Social Physics,” IMPACT, 3 (1952), pp. 110-
133. An early statement of potential of population can be found in Stew-
art’s “The Gravity of the Princeton Family,” PRINCETON ALUMNI WEEKLY,
40 (1940), pp. 409-410. See also the following by William Warntz: “Prog-
ress in Economic Geography,” in NEW VIEWPOINTS IN GEOGRAPHY, ed. by
Preston E. James, Washington, D. C. (1959), pp. 54-75; “[illegible]
and the Census,” PROFESSIONAL GEOGRAPHER, 10 (1959), pp. 6-10; “Geogra-
phy at Mid-twentieth Century,” WORLD POLITICS, 11 (1959), pp. 442-454;
TOWARD A GEOGRAPHY OF PRICE, Philadelphia (1959), 117 pp.; and “A Macro-
geographer Takes a Hard Look at College Enrollments,” PRINCETON ALUMNI
WEEKLY, 59 (1959), pp. 8-13.
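
The comparison between the 1950 distribution and the probability density surface can be set out in a few lines. The observed shares (69% and 90%) are the figures reported in this section; the model shares come from the relation that the fraction within k standard distance deviations is one minus exp(-k^2). The variable and function names are illustrative only.

```python
import math

def model_share_within(k):
    """Share of the population within k standard distance deviations (model)."""
    return 1.0 - math.exp(-k ** 2)

# Observed 1950 shares, as reported in the text.
observed_within = {1: 0.69, 2: 0.90}

for k in (1, 2):
    print(k, round(model_share_within(k), 3), observed_within[k])

model_ring = model_share_within(2) - model_share_within(1)
observed_ring = observed_within[2] - observed_within[1]
print(round(model_ring * 100), round(observed_ring * 100))  # 35 21
```

The model places about 63% within one deviation and 98% within two, so the actual population is somewhat more concentrated near the mean center and markedly thinner in the ring between one and two deviations.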


The study of the variation through time of each of the measures 
discussed in this paper gives definitive evidence of a continuing pri- 
mary pattern of population distribution in the United States. As an 
example it can be shown that since 1840 when New York City emerged as 
the modal center and principal peak of potential of population, the 
skewness of the population has been extremely close to unity. The 
maximum value was computed as 1.05 and the minimum as 0.95. In other 
words, the westward movement of the mean center along the 39th parallel
of latitude has always been accompanied by the appropriate increase
in the dynamical radius. 


6. CONCLUSION 


We have attempted to show how currently accepted statistical con- 
cepts and methods may be extended to enable investigators to analyze 
and classify spatial distributions. If these conceptual developments 
can increase the rigor with which such analysis can be made, then per- 
haps some of the deficiencies in present spatial studies, particularly 
with regard to the testing of theories and hypotheses, may be remedied. 


However, there remains a great deal to be done. Applications of 
statistical methods to areal distributions are still few, and those 
methods themselves are yet at an early stage of development. The 
demographic characteristics of the United States are only a small
segment of an immense number of types of phenomena which could be
investigated in this manner with rewarding insights and conclusions.¹⁵
While it is hoped that the present paper will be immediately useful, it
is especially desired that it will stimulate investigation in a subject
too long neglected.¹⁶





¹⁵ For further discussion see Warntz, W., MACROGEOGRAPHY AND THE UNITED
STATES, now in preparation.


¹⁶ A number of persons have already given incalculable aid, but having
done so, must not be considered as accessories before the fact of any
errors subsequently found here. Our continued thanks and sense of in-
debtedness go to John Q. Stewart, Associate Professor of Astronomical
Physics, Princeton University. O. M. Miller, Assistant Director of the
American Geographical Society, N.Y.C., has provided key references and
listened patiently. William Briesemeister, Douglas Waugh, and Alexander
Bobrovsky of the society’s cartographic staff provided illustrations.
Many of the computations have been performed by Robert Harris and John
Macisco of the instructional staffs of Columbia University and Fordham
University, respectively.


GENERAL INTERREGIONAL EQUILIBRIUM*


by Walter Isard and David J. Ostroff**


1. INTRODUCTION


In this brief paper we reformulate and further develop certain mat-
erials on a general equilibrium model of the Walrasian type for an inter-
regional economy.¹ The notation follows that of existence theorems in
Arrow and Debreu and in Isard and Ostroff.²


1.0. Let there be U one-point regions (L=1,...,U) and $\ell$ commodi-
ties (h=1,...,$\ell$). For notational convenience alone, posit in each re-
gion: (1) n producers (j=1,...,n); and (2) m consumers (i=1,...,m) each
of whom possesses initial holdings of commodities as given by the vect-
or $\zeta_i^L$ ($\zeta_i^L \in R^\ell$ where $R^\ell$ is the Euclidean space of $\ell$ dimensions), any
component $\zeta_{h,i}^L$ representing the initial stock of commodity h held by
consumer i in region L.³ Let there also be one world trader, free to
ship any non-negative amount of commodity h (h=1,...,$\ell$) between any two
regions J and L (J,L=1,...,U; J≠L).





*The authors wish to acknowledge support of their research by a grant
from Resources for the Future, Inc. They are also indebted to Benjamin
H. Stevens for fruitful comments and suggestions, though they alone are
responsible for any shortcomings of the analysis.


*The authors are respectively Professor of Regional Science and
Assistant Instructor in Mathematics at the University of Pennsylvania.

1Particularly those contained in W. Isard, “General Interregional
Equilibrium,” PAPERS AND PROCEEDINGS OF THE REGIONAL SCIENCE ASSOCIATION,
Vol. III, 1957, pp. 35-60.


2K. J. Arrow and G. Debreu, “Existence of an Equilibrium for a
Competitive Economy,” ECONOMETRICA, Vol. XXII, July 1954, pp. 265-290;
and W. Isard and D. J. Ostroff, “Existence of a Competitive Interregional
Equilibrium,” PAPERS AND PROCEEDINGS OF THE REGIONAL SCIENCE ASSOCIATION,
Vol. IV, 1958, pp. 49-76.


3An alternative is to treat the components of $\zeta_i^L$ as variables,
and to introduce below supply functions
$\zeta_{h,i}^L(p_1^L,\ldots,p_\ell^L)$, where $h=1,\ldots,\ell$;
$i=1,\ldots,m$; and $L=1,\ldots,U$, which govern the amounts of commodity
(resource) $h$, $\sum_{i=1}^{m} \zeta_{h,i}^L$, supplied by consumers. By
so doing, we would increase both the number of variables and equilibrium
conditions, as described below, by $Um\ell$. For further discussion, see
Isard, op. cit., pp. 27-44.
2. VARIABLES IN THE SYSTEM


The variables of the system are as follows:

(1) For each producer the inputs and outputs of all commodities
as given by the vector $y_j^L \in R^\ell$ ($j=1,\ldots,n$;
$L=1,\ldots,U$), any component $y_{h,j}^L$ being negative when the
commodity $h$ is an input and positive when $h$ is an output. Over all
producers, these variables are $Un\ell$ in number.


(2) For each consumer, the final demand for all commodities as
given by the vector $x_i^L \in R^\ell$ ($i=1,\ldots,m$; $L=1,\ldots,U$),
any component $x_{h,i}^L$ of which represents final demand for the
commodity $h$ ($h=1,\ldots,\ell$). Over all consumers these variables
are $Um\ell$ in number.


(3) For the world trader, the shipment of all commodities between
any combination of originating region $L$ ($L=1,\ldots,U$) and
terminating region $J$ ($J=1,\ldots,U$; $J \ne L$), as given by the
non-negative vector $g^{L\to J} \in R^\ell$, where the component
$g_h^{L\to J}$ refers to the shipment of commodity $h$. There are
$U(U-1)$ combinations of such regions, and therefore $U(U-1)$ such
vectors. The number of variables, i.e. components of such vectors, is
$U(U-1)\ell$.
(4) For each region, the set of internal (market) prices which
can be represented by the non-negative price vector $p^L \in R^\ell$,
where the component $p_h^L$ represents the price of commodity $h$. Over
all regions, there are $(U\ell-1)$ variable price ratios to be determined
since the system is homogeneous in prices.


(5) For each region, an asset transfer $Q^L$ (which can be either
positive or negative), necessary to effect a balance of payments. Since
$\sum_L Q^L = 0$ there are only $(U-1)$ of such variables which are
independent.


3. EQUILIBRIUM CONDITIONS 
We have the following equilibrium conditions: 
(1a) For each producer, his production vector must be contained
within the space defined by a production function (representing
technical conditions):

$$\varphi_j^L(y_{1,j}^L, y_{2,j}^L, \ldots, y_{\ell,j}^L) = 0, \qquad j=1,\ldots,n;\; L=1,\ldots,U \tag{1a}$$

having continuous partial derivatives. Over all producers in all
regions, there are $Un$ such conditions.


(1b) His production plan must meet the following efficiency
conditions:

$$\frac{\varphi_{h,j}^L}{\varphi_{h',j}^L} = \frac{p_h^L}{p_{h'}^L}, \qquad h=1,\ldots,\ell;\; h \ne h';\; j=1,\ldots,n;\; L=1,\ldots,U \tag{1b}$$

where $\varphi_{h,j}^L$ and $\varphi_{h',j}^L$ are the partials of the
function $\varphi_j^L$ with respect to $y_{h,j}^L$ and $y_{h',j}^L$
respectively. When outputs are specified in net terms, there are
$Un(\ell-1)$ of such independent conditions for the interregional
economy.


It should be noted here that for ease of presentation, it is
assumed that none of the components of a producer's equilibrium
production plan is zero, i.e. that some of each commodity is either
produced or used as an input. A more realistic assumption can be made,
namely, that any producer is concerned with the selection of non-zero
amounts of only a limited number of commodities, which commodities are
specified beforehand. For any pair of such commodities, relations (1b)
must hold. If neither of these assumptions is made, inequalities and
“do not exist” conditions must be considered.


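(2) For each consumer, there is assumed an ordinal utility
function:

$$u_i^L = u_i^L(x_{1,i}^L, x_{2,i}^L, \ldots, x_{\ell,i}^L), \qquad i=1,\ldots,m;\; L=1,\ldots,U$$

where

$$u_{h,i}^L = \frac{\partial u_i^L}{\partial x_{h,i}^L}.$$

The equilibrium condition that $u_i^L$ ($i=1,\ldots,m$; $L=1,\ldots,U$)
be a maximum subject to budget balance:4

$$p^L \cdot (x_i^L - \zeta_i^L) = 0 \tag{2a}$$

is:

$$\frac{u_{h,i}^L}{u_{h',i}^L} = \frac{p_h^L}{p_{h'}^L}, \qquad h=1,\ldots,\ell;\; h \ne h' \tag{2b}$$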





4To avoid additional notation, we posit that entrepreneurial (risk-taking, 
management, etc.) services represent a commodity of which each indivi- 
dual may hold a stock. The price of this commodity is taken to be cor- 
related with a “normal” profit or return. Thus, profits resulting from 
production and export activities are payment for entrepreneurial service 
which payments are included in the left-hand side of eq. (2a). The 
reader, however, may wish to treat profits explicitly as is done in 
Isard and Ostroff (op. cit.) 
Over all consumers over all regions, there are $Um$ equations (2a) and
$Um(\ell-1)$ independent equations (2b), on the assumption that at
equilibrium $x_{h,i}^L > 0$. If for commodity $h$ we allow
$x_{h,i}^L = 0$ at equilibrium, inequalities and “do not exist”
conditions must be considered.


(3) For the world trader, there is defined a “gains from trade”
function:

$$G = \sum_{L} \sum_{J \ne L} \pi^{L\to J} \cdot g^{L\to J} \tag{3a}$$

where $\pi^{L\to J} \in R^\ell$ and for any $h$th component
($h=1,\ldots,\ell$)

$$\pi_h^{L\to J} = p_h^J - p_h^L - p_{h'}^L\, d^{L\to J}\, w_h \tag{3b}$$

In (3b), $h'$ refers to the commodity transportation service (transport
inputs), its price $p_{h'}^L$ being the transport rate at $L$;
$d^{L\to J}$ is a predetermined number indicating distance from region
$L$ to region $J$, and $w_h$ is another predetermined number indicating
the ideal weight (à la Weber) of commodity $h$. The last term on the
right-hand side of equation (3b) represents the cost of transporting a
unit of commodity $h$ from region $L$ to region $J$.5


Assuming that the world trader has no control over prices and any
$\pi_h^{L\to J}$ ($h=1,\ldots,\ell$; $L,J=1,\ldots,U$; $L \ne J$), and
thus views these as constants,6 the equilibrium condition that $G$, a
linear function in $g_h^{L\to J}$, be a maximum, is:





5Note that we adopt the convention that the world trader purchases the
necessary transportation service to effect any export from the local 
market of the region of export. The region of export, however, may re- 
ceive (via import) supplies of transportation services produced by units 
in other regions. Other conventions are obviously possible; they would 
not require any basic changes in our statement. Observe, too, that the 
world trader requires as inputs only transport services and the commodi- 
ties to be shipped; his outputs are the commodities delivered to the 
regions of import. 


6Our fiction of a world trader is analogous to the fiction of a market
participant in Arrow and Debreu (op. cit.). The market participant,
who sets price but views $z^L$ (to be defined below) as a set of
constants, reflects the operation of the competitive mechanism on the
local market. Our world trader reflects the competitive mechanism as it
governs trade among regions.
$$\pi_h^{L\to J} \le 0, \qquad h=1,\ldots,\ell;\; L,J=1,\ldots,U;\; L \ne J \tag{3c}$$

It is clear that for any $\pi_h^{L\to J} > 0$, an equilibrium cannot
exist. At an equilibrium point, whenever $g_h^{L\to J} > 0$,
$\pi_h^{L\to J} = 0$, i.e. whenever there are positive shipments, the
price spread for any commodity $h$ between any two regions $J$ and $L$
must equal the cost of transporting a unit of commodity $h$ from $L$ to
$J$; and whenever $\pi_h^{L\to J} < 0$, $g_h^{L\to J} = 0$, i.e. whenever
the price spread is less than this cost of transportation, shipments of
commodity $h$ are zero, although shipments may be zero when the price
spread is just equal to this transportation cost.

The number of relations (3c) is $U(U-1)\ell$.
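The pairing of price spreads and shipments described above can be checked mechanically. The following Python fragment is only an illustrative sketch, not part of the authors' apparatus: the function names and all numbers are hypothetical, and the unit gain is computed as price at J, less price at L, less transport rate at L times distance times ideal weight, following (3b).

```python
def unit_gain(p_J_h, p_L_h, rate_L, dist_LJ, w_h):
    """Price spread minus unit transport cost for commodity h on route L -> J."""
    return p_J_h - p_L_h - rate_L * dist_LJ * w_h


def consistent_with_equilibrium(pi, shipment, tol=1e-9):
    """True if gain <= 0, shipment >= 0, and the two are not both non-zero."""
    if pi > tol or shipment < -tol:
        return False
    return abs(pi * shipment) <= tol


# Hypothetical figures: price 10 at L and 13 at J, transport rate 0.5 at L,
# distance 4, ideal weight 1.5, so the unit transport cost is 3.
pi = unit_gain(p_J_h=13.0, p_L_h=10.0, rate_L=0.5, dist_LJ=4.0, w_h=1.5)
print(pi)                                       # zero gain: shipments may be positive
print(consistent_with_equilibrium(pi, 25.0))    # positive shipment at zero gain
print(consistent_with_equilibrium(-1.0, 25.0))  # shipment on an unprofitable route
print(consistent_with_equilibrium(-1.0, 0.0))   # unprofitable route left unused
```

A positive gain is rejected whatever the shipment, which mirrors the remark that no equilibrium can exist in that case.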


(4) Following the traditional procedure of excluding from our
framework zero-price commodities, we have for each region the following
supply equals demand relationships:

$$\sum_i x_i^L + \sum_{J \ne L} s^{L\to J} = \sum_i \zeta_i^L + \sum_j y_j^L + \sum_{J \ne L} g^{J\to L} \tag{4a}$$

In (4a) $s^{L\to J}$ is equivalent to $g^{L\to J}$ except for the $h'$
(transportation service) component. We define

$$s_{h'}^{L\to J} = g_{h'}^{L\to J} + d^{L\to J}\,(g^{L\to J} \cdot w) \tag{4b}$$

where the vector $w \in R^\ell$ has components $w_h$
($h=1,\ldots,\ell$). For all regions, we have $U\ell$ supply = demand
conditions.


(4') As an alternative to 4, let there be in each region a
fictitious market participant (à la Arrow and Debreu) who desires to
maximize $p^L \cdot z^L$ where

$$z^L = \sum_i x_i^L - \sum_i \zeta_i^L - \sum_j y_j^L + \sum_{J \ne L} s^{L\to J} - \sum_{J \ne L} g^{J\to L} \tag*{(4a)'}$$

Since the market participant has no control over any $z_h^L$
($h=1,\ldots,\ell$) and thus views these as constants, but is free to
select $p^L$ ($p^L \in R^\ell$, $p^L \ge 0$), the condition that
$p^L \cdot z^L$ be a maximum is:

$$z_h^L \le 0 \qquad (h=1,\ldots,\ell) \tag*{(4b)'}$$
It is clear that for any $z_h^L > 0$, an equilibrium cannot exist. At
an equilibrium point, (1) whenever $p_h^L > 0$, $z_h^L = 0$, i.e. supply
equals demand for all commodities with positive price; and (2) whenever
$z_h^L < 0$, $p_h^L = 0$, i.e. whenever the demand for a commodity falls
short of available supply, price is zero, although price may be zero for
a condition of equality between demand and supply. In this manner free
goods may be incorporated into the framework of our interregional
economy, although to do so would require reformulation of conditions
(1b) and (2b).

All told there are $U\ell$ conditions of the type (4b)'.


(5) Finally, for each region we have a balance of trade condition:

$$p^L \cdot \sum_{J \ne L} g^{J\to L} - p^L \cdot \sum_{J \ne L} s^{L\to J} + Q^L = 0. \tag{5}$$

Over all regions, there are $U$ such conditions of which only $U-1$ are
independent.


All told there are $U\ell[n+m+U]+U-2$ variables whose values are to be
determined and that many plus one conditions for their determination.
However, one of the latter can be shown to be redundant.7 Hence, the
count of unknowns and conditions is the same.
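The tally of unknowns and conditions can be verified by adding up the counts given item by item in Sections 2 and 3. The Python sketch below does only that arithmetic; the function names and the trial dimensions are hypothetical, and `ell` stands for the number of commodities.

```python
def count_variables(U, ell, n, m):
    """Unknowns of Section 2 for U regions, ell commodities, n producers, m consumers."""
    producers = U * n * ell          # (1) input-output vectors
    consumers = U * m * ell          # (2) final demand vectors
    shipments = U * (U - 1) * ell    # (3) world trader's shipment vectors
    price_ratios = U * ell - 1       # (4) homogeneity in prices removes one
    transfers = U - 1                # (5) asset transfers sum to zero
    return producers + consumers + shipments + price_ratios + transfers


def count_conditions(U, ell, n, m):
    """Conditions of Section 3, before removing the redundant one."""
    return (
        U * n                  # (1a) production functions
        + U * n * (ell - 1)    # (1b) producer efficiency conditions
        + U * m                # (2a) budget balances
        + U * m * (ell - 1)    # (2b) consumer conditions
        + U * (U - 1) * ell    # (3c) world trader conditions
        + U * ell              # (4a) or (4b)' supply = demand conditions
        + (U - 1)              # (5) independent balance of trade conditions
    )


U, ell, n, m = 3, 4, 2, 5   # hypothetical dimensions
variables = count_variables(U, ell, n, m)
print(variables, U * ell * (n + m + U) + U - 2)   # both give 121
print(count_conditions(U, ell, n, m))             # 122, i.e. one more
```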


4. TRANSPORTATION AND ASSET TRANSFERS 


It is to be noted that the shipment variable $g_h^{L\to J}$ and the
asset transfer variable $Q^L$ cannot directly be determined as can the
other variables of the system once an equilibrium set of prices is
given. E.g. for such prices, we determine directly an equilibrium
input-output schedule for each firm from (1a) and (1b). But we only know
directly that when a $\pi_h^{L\to J} = 0$, the corresponding shipment is
non-negative (although when a $\pi_h^{L\to J} < 0$, the corresponding
shipment is zero). However, an equilibrium set of shipments and asset
transfers can be determined indirectly via a computation somewhat
related to linear programming.


For an equilibrium set of prices (zero price commodities excluded 





7E.g. see Isard, op. cit., pp. 45-46; R. G. D. Allen, MATHEMATICAL
ECONOMICS, New York, 1956, pp. 321-322; and R. Dorfman, P. A. Samuelson,
and R. M. Solow, LINEAR PROGRAMMING AND ECONOMIC ANALYSIS, New York,
1958, p. 354.
from the model),8 $G = 0$, and $\sum_i x_{h,i}^L$ and
$\sum_j y_{h,j}^L$ can be determined from relations (1a), (1b), (2a),
and (2b). From (4a), the net export (or import) of $h$ is then obtained
for each region. (Note that cross-hauling is inconsistent with a maximum
value for $G$, as will be evident below, and that shipments from one
region to a second which happen to pass through a third are by
definition not recorded in the export and import data of the third.)
Further, from (3b) we have:





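$$\pi^{L\to J} = p^J - p^L - p_{h'}^L\, d^{L\to J}\, w$$

By substitution into (3a):

$$G = \sum_L \sum_{J \ne L} p^J \cdot g^{L\to J} - \sum_L \sum_{J \ne L} p^L \cdot g^{L\to J} - \sum_L \sum_{J \ne L} p_{h'}^L\, d^{L\to J}\,(g^{L\to J} \cdot w) \tag{6}$$

Let

$$g^{\to J} = \sum_{L \ne J} g^{L\to J} \quad \text{and} \quad g^{L\to} = \sum_{J \ne L} g^{L\to J} \tag{7}$$

Then (6) becomes:

$$G = \sum_J p^J \cdot g^{\to J} - \sum_L p^L \cdot g^{L\to} - \sum_L \sum_{J \ne L} p_{h'}^L\, d^{L\to J}\,(g^{L\to J} \cdot w) \tag{8}$$

But for a given set of equilibrium prices, the equilibrium $g^{\to J}$
(the total import vector of $J$, $J=1,\ldots,U$) and $g^{L\to}$ (the
total export vector of $L$, $L=1,\ldots,U$) are determined as above.
Hence, a maximum $G$ must correspond to a minimum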





8It is appreciated that one cannot adequately specify beforehand which
commodities will be zero price and can be excluded from the framework
of a model. And it may well be that at one “low-level” equilibrium
point a good such as water may be a “free” good while at a second
“high-level” equilibrium point water may have a positive price. Similar
difficulties are encountered when one attempts to specify beforehand
those commodities for which $x_{h,i}^L > 0$, or for which
$y_{h,j}^L \ne 0$ so that equalities (2b) and (1b), respectively, obtain
at equilibrium. In this sense our model is not truly general.











$$\sum_L \sum_{J \ne L} p_{h'}^L\, d^{L\to J}\,(g^{L\to J} \cdot w),$$

i.e. to the minimum for total transportation costs when the equilibrium
levels of total imports and exports by regions are specified. But this
is a problem related to the typical Koopmans-Hitchcock transportation
problem. Since for any given set of equilibrium prices we know in our
problem those shipments which are unprofitable ($\pi_h^{L\to J} < 0$),
and hence those shipment variables which can be definitely excluded from
a “basic” solution, a simplified version of a linear programming
computation will suffice to yield an optimal (equilibrium) pattern of
shipments.9
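To make the transportation-problem step concrete, the following Python sketch solves the smallest non-trivial case: one commodity, two exporting regions and two importing regions, with total exports and imports already fixed. It is an illustration under stated assumptions, not the authors' computation. The function name and all figures are hypothetical, and each unit cost stands for the transport rate at the exporting region times distance times ideal weight.

```python
def min_cost_2x2(exports, imports, cost):
    """Cheapest shipment pattern from two exporters to two importers.

    exports, imports: pairs of non-negative totals with equal sums.
    cost[i][j]: unit transport cost from exporter i to importer j.
    With the totals fixed, only the flow x from exporter 0 to importer 0
    is free, and total cost is linear in x, so an endpoint of the feasible
    range of x is optimal.
    """
    (e0, e1), (m0, m1) = exports, imports
    if abs((e0 + e1) - (m0 + m1)) > 1e-9:
        raise ValueError("total exports must equal total imports")
    lo, hi = max(0, e0 - m1), min(e0, m0)
    best = None
    for x in (lo, hi):
        flows = [[x, e0 - x], [m0 - x, e1 - (m0 - x)]]
        total = sum(cost[i][j] * flows[i][j] for i in range(2) for j in range(2))
        if best is None or total < best[0]:
            best = (total, flows)
    return best


# Hypothetical data: exporters ship 30 and 20 units, importers take 25 each.
total_cost, flows = min_cost_2x2((30, 20), (25, 25), [[2, 5], [4, 3]])
print(total_cost)   # 135
print(flows)        # [[25, 5], [0, 20]]
```

For more regions or commodities a general linear programming routine would be needed; the endpoint argument used here holds only because a single shipment variable is free.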


Once a pattern of equilibrium $g^{L\to J}$ is found, the determination
of the asset transfer variable $Q^L$ can be direct.


It should also be observed that since at equilibrium
$\sum_L z_{h'}^L = 0$ for an interregional economy where the transport
rate is positive,10 from (4a)'

$$\sum_L \sum_i \zeta_{h',i}^L + \sum_L \sum_j y_{h',j}^L - \sum_L \sum_i x_{h',i}^L = \sum_L \sum_{J \ne L} s_{h'}^{L\to J} - \sum_L \sum_{J \ne L} g_{h'}^{J\to L} \tag{9}$$

But, from (4b) and (7)

$$\sum_L \sum_{J \ne L} s_{h'}^{L\to J} - \sum_L \sum_{J \ne L} g_{h'}^{J\to L} = \sum_L \sum_{J \ne L} d^{L\to J}\,(g^{L\to J} \cdot w) \tag{10}$$

Therefore

$$\sum_L \sum_i \zeta_{h',i}^L + \sum_L \sum_j y_{h',j}^L - \sum_L \sum_i x_{h',i}^L = \sum_L \sum_{J \ne L} d^{L\to J}\,(g^{L\to J} \cdot w) \tag{11}$$

i.e. at an interregional equilibrium, the available world supply of
transport services after all consumers and producers have been furnished
with their internal (intra-regional) requirements of transport services
must be exactly equal to the transport requirements to effect a minimum
cost interregional shipment program which corresponds to a maximum value
for $G$.





9E.g. see P. A. Samuelson, “Spatial Price Equilibrium and Linear
Programming,” AMERICAN ECONOMIC REVIEW, Vol. XLII, June, 1952,
pp. 283-303.


10Note that if at equilibrium the transport rate is positive in any one 
region, it must be positive in all regions. 
COMMUNITY INCOME MULTIPLIERS:
A POPULATION GROWTH MODEL†


by Charles M. Tiebout* 


1. INTRODUCTION


The economic base of a region has been discussed at both the
theoretical and the empirical level. At the theoretical level, some
efforts have been directed at an integration of base theory with more
general economic theory, either foreign trade multiplier or input-output
models.1 At the empirical level, data limitations have forced most
studies to use employment as the magnitude to be investigated.2 It is
desirable, however, to set up an empirical model measuring income in
dollars.





†I am grateful for the comments of Charles Leven, Robert Strotz, Daniel
Suits, and especially to the late Stefan Valavanis. The errors are my
responsibility.


*The author is Professor of Economics at the University of California,
Los Angeles.


1See W. R. Pfouts and Earle Curtis, “Limitations of the Economic Base
Analysis,” SOCIAL FORCES, XXXVI (May, 1958), 304-10; Charles M. Tiebout,
“Input-Output versus Foreign Trade Multiplier Models in Urban Research,”
JOURNAL OF THE AMERICAN INSTITUTE OF PLANNERS, XXIII No. 3 (1957), 24-29.
Research in this area also includes the studies of Isard and Kuenne,
Moore and Petersen, Hildebrand and Mace and is concerned with the
effects on one region’s employment (income) of a change in exports.
[Walter Isard and Robert Kuenne, “The Impact of Steel Upon the Greater
New York-Philadelphia Urban-Industrial Region,” REVIEW OF ECONOMICS AND
STATISTICS, XXXV (November, 1953), 289-301; Fredrick Moore and James
Petersen, “Regional Analysis: An Interindustry Model of Utah,” REVIEW OF
ECONOMICS AND STATISTICS, XXXVII (November, 1955), 368-80; George
Hildebrand and Arthur Mace, “The Employment Multiplier in an Expanding
Industrial Market: Los Angeles County, 1940-47,” REVIEW OF ECONOMICS AND
STATISTICS, XXXII (August, 1950), 241-49. Moses in an interregional
study is concerned with the effect of a change in final demand on a set
of regions. Leon Moses, “Interregional Input-Output Analysis,” AMERICAN
ECONOMIC REVIEW, XLV (December, 1956), 803-32.]


2References may be found in the series by Richard Andrews, “The
Mechanics of the Urban Economic Base,” LAND ECONOMICS, XXIX (1953),
continuing series. One study in dollar income--really value added--is
that of Charles Leven, “The Theory of Social Accounts,” PAPERS AND
PROCEEDINGS, REGIONAL SCIENCE ASSOCIATION, IV (1958), 221-38.
This paper will develop an income model similar in spirit to an
economic base study. The model handles a specific form of regional
income change, where increased income is brought about by an increase in
population. A population income model is of current interest because
metropolitan regions and their urban and suburban components have been
experiencing accelerated population growth during the past decade. It
is this rapid growth which has given rise to what many people consider
to be “problem areas.” Field studies in two suburban communities will
allow us to fit a set of income and expenditure data to this model in
order to gain some idea of the magnitudes involved.


2. A COMMUNITY INCOME MODEL 


The income aggregate to be investigated is defined as “income
accruing to residents of the community.” This aggregate is analogous to
personal income for the nation. By choosing this measure some of the
problems involved in deriving a concept such as “community net product”
are avoided.3


The simple model (base) to be used is of a Keynesian variety:

$$Y = Y_x + Y_n \tag{1}$$
$$C_l = gY \tag{2}$$
$$Y_n = hC_l \tag{3}$$
$$Y_x = \text{exogenously set} \tag{4}$$


In (1) the total income accruing to residents is defined as the sum of
the exogenous income ($Y_x$) and endogenous income ($Y_n$). In the empirical


application, income originating in the retail and service sector as a 
result of sales to local residents plus local government is considered 
endogenous income. Exogenous income includes income originating in ex- 
ports. (In the communities under consideration contract construction-- 
investment income--carried out by local firms was negligible.) Equation 
(2) notes that consumption expenditures on local goods is a function of 
local income, and g is the propensity to consume local goods. This 
recognizes as leakages a propensity to import, pay taxes, and save. 
Equation (3) notes that only a fraction (h) of the expenditures made 
locally create income to local residents, where h is the local income 
created per dollar of sales by local activities. Part of local expend- 
itures are immediately leaked to other regions; e.g., to wholesalers, 
wages to nonresident employees, and so forth. Exogenous income (4) is, 
by definition, given. 


In terms of this model consider two special ways by which an in- 
crease (decrease) in regional income may come about: (1) the income of 
all residents of a community increases (decreases) by some marginal 





3For a discussion of some of the conceptual problems involved in
regional income accounting see: Werner Hochwald, “Conceptual Issues in
Regional Income Estimation,” in REGIONAL INCOME: STUDIES IN INCOME AND
WEALTH, XXI (Princeton: Princeton University Press, 1957), 9-26; Leven,
op. cit.
amount; it may be called a “per-capita income shock.” (2) A family
with, say, 10,000 dollars of dividend income moves into (out of) the 
community. This will be termed a “population-income shock.” As noted, 
our interest is in the latter source of income change. 


We turn to the assumptions underlying this model: 


1. Population initiates and responds to economic changes in the 
local economy. 


2. External multiplier feedbacks are ignored. We assume that,
given an autonomous change in a community’s income, the resulting
changes of income in neighboring regions are negligible, and the echo of
the latter upon the income of the community under consideration even
more negligible. This is analogous to the small nation case in
international trade.


3. The two communities chosen for study allow us to assume that
the “indirect effect”--the interindustry effect among manufactures and
wholesaling--is negligible.4 This limitation is conducive to our
objective of studying just the consumption-multiplier in a community.


The available data are for one year. As a consequence, four fur- 
ther assumptions are made: 


4. The changes in the level of exogenous variables considered 
must be small. Large changes which give rise to structural shifts are 
not handled; e.g., a population boom which occasions a whole new shop- 
ping center. If our region were a whole metropolitan area, structural 
shifts within the region would be less of a problem. 


5. The empirical observations, taken at one point in time, are
assumed to represent a near equilibrium situation. In line with
assumption four, the possibility that the data represent a community
which has not fully adjusted to some exogenous change is eliminated. For
the communities studied this is a reasonable assumption.


6. The spending patterns of new residents will be similar to those
of established residents. This assumption will be modified below.





4An increase in an exogenous activity, say exports, “directly” increases
the income of those employed in exports. It also increases the demand
by the export industries for inputs from other firms within the region
and, in turn, increases the income of households associated with these
industries. This increase in income, brought about by the inter-industry
effect, in standard terminology is called the “indirect effect.”


Given the increase in income directly and indirectly, a third component 
of income change is brought about by increased household consumption. 
It is the consumption-function multiplier’s contribution to the increased 
income of the region. There are certain regions where the indirect ef- 
fect is small. By analyzing them we can get fair estimates of the re- 
gional consumption multiplier. 











78 JOURNAL OF REGIONAL SCIENCE, VOL. 2, NO. 1, 1960 





7. Since only one year’s observations are available, empirical 
estimates of g and h represent average propensities and not the marginal 
propensities usually associated with multiplier models. It is assumed 
that g and h are constant for all levels of income and consumption re- 
spectively. That is, functions (2) and (3) pass through the origin. 
This assumption requires some discussion. 


The assumption that g, the propensity to consume local goods, is
constant for all levels of income is consistent with assumptions one
and six, the zero value of consumption at the origin merely expressing
an empty community. If a new family moves in with an income equal to
the average of the established residents and has a spending pattern
similar to theirs, it is the average propensity to consume which is
important in the first round of spending. And in communities where
leakages are large--and multiplier values low relative to national
multipliers--it is the first round of spending which is important.5 The
exogenous change is not a small increment of income to all families,
but a large increment to one family.6


The parameter h, the local income created per dollar of sales by 
local activities, is also assumed to be constant. National data suggest 
that income created in retail and service activities is proportional to 
sales. This assumption is consistent with the objectives of the short- 
run analysis considered here in that it assumes proportionate changes 
in material and labor requirements. 


Equations (2) and (3) may be rewritten as:

$$Y_n = bY \tag{2a}$$

Hence, $b$ is interpreted as the “income creating local propensity to
consume.” It is, of course, simply the product $gh$. If $Y_x$ and $Y_n$
are known, $b$ can be determined. Also, subject to the assumptions noted
above, $1 - b$ is the reciprocal of the community income multiplier of
the population-income model.7
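The multiplier can also be read as the limit of successive rounds of local spending, each round returning the fraction gh of the previous round's income to residents. The Python sketch below illustrates this under the model's own assumptions (constant g and h); the function names and the parameter values are hypothetical and are not estimates from the field studies.

```python
def multiplier(b):
    """Community income multiplier 1 / (1 - b), for 0 <= b < 1."""
    if not 0 <= b < 1:
        raise ValueError("b must lie in [0, 1)")
    return 1.0 / (1.0 - b)


def income_by_rounds(y_x, g, h, rounds):
    """Total income after the initial injection plus `rounds` re-spending rounds."""
    total, injection = 0.0, y_x
    for _ in range(rounds + 1):
        total += injection
        injection *= g * h   # share g is spent locally, share h of that is local income
    return total


# Hypothetical parameters: g = 0.5 and h = 0.2, so b = gh = 0.1.
print(income_by_rounds(100.0, 0.5, 0.2, 0))    # the initial 100 only
print(income_by_rounds(100.0, 0.5, 0.2, 1))    # about 110 after the first round
print(income_by_rounds(100.0, 0.5, 0.2, 50))   # approaches 100 * multiplier(0.1)
print(100.0 * multiplier(0.1))
```

With b this small, nearly all of the induced income arrives in the first round, which is the point made in the text about communities with large leakages.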





5It is possible to estimate some marginal propensities from the cross
section data in each community. This will not be done here.


6For this reason, national models dealing with exogenous changes where
the impact is on spending units usually specify how the initial impact
is created. Are all families given marginal increments or do some
families (the unemployed) receive large increments? In other words, is
the distribution of income constant?


7This follows from the substitution of equation (2a) in (1), thus:

$$Y = bY + Y_x$$

or $Y(1 - b) = Y_x$; therefore,

$$Y = \frac{Y_x}{1 - b}.$$
3. FIELD STUDIES


The two Chicago suburban communities of Evanston (population 72,000)
and Winnetka (population 13,000), Illinois, are located along the high
income North Shore. Survey estimates place mean spending unit incomes
at 9,490 dollars for Evanston and 18,400 dollars for Winnetka in the
year 1954.8


A random sample survey of spending units in each community was
conducted in order to estimate incomes, the geographic sources of
incomes, some expenditure patterns, and other information. Personal
interviews were conducted with 169 spending units in Evanston and 139 in
Winnetka. The details of the survey have been considered elsewhere.9


Other information was drawn from surveys in the business sectors, 
1954 Census of Business, Illinois Retailers Occupation Tax data, and 
from local government expenditures. 


4. INCOME MULTIPLIER ESTIMATES 


Data for various categories of income and their sources are pre- 
sented in Table 1. Column headings indicate whether the entry is as- 
sumed to be exogenous or endogenous. Note that government expenditures 
for this model are assumed to be endogenous. 


Total 1954 incomes, estimated from the survey of spending units,
are Winnetka, 65.1 million, and Evanston, 228.0 million. The surveys
also estimate income earned inside the communities as Winnetka, 4.2
million, and Evanston, 53.0 million. Table 1 presents estimates of
income earned inside the community from the nonspending units survey
sources noted above. The statistical discrepancy represents the
differences in the two sources.


The key data in Table 1 are the 3.3 and 19.8 million of endogenous
income out of a total of 65.1 and 228.0.10 The b values of equation
(2a) are then equal to .051 and .087 for Winnetka and Evanston; and the
multiplier values are Winnetka, 1.054 and Evanston, 1.096.
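The arithmetic behind these figures can be retraced from the endogenous and total incomes just quoted. The following Python sketch is an illustration only; the function name is hypothetical. Because it uses unrounded ratios, its multipliers agree with the printed values to within about .001 but need not match them in the third decimal place.

```python
def b_and_multiplier(endogenous_income, total_income):
    """Return b = endogenous / total income and the multiplier 1 / (1 - b)."""
    b = endogenous_income / total_income
    return b, 1.0 / (1.0 - b)


# Endogenous and total income in millions of dollars, as quoted in the text.
figures = {"Winnetka": (3.3, 65.1), "Evanston": (19.8, 228.0)}
for community, (endogenous, total) in figures.items():
    b, k = b_and_multiplier(endogenous, total)
    print(f"{community}: b = {b:.3f}, multiplier = {k:.3f}")
```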





8Spending units are defined as people who pool their income for most
purchases. In the case of these two communities it is just about the
equivalent of family income.


9For greater details on the survey and the estimates presented below
see Charles M. Tiebout, THE COMMUNITY INCOME MULTIPLIER: AN EMPIRICAL
STUDY (Unpublished Ph.D. dissertation), University of Michigan, Ann
Arbor, 1957.


10Assuming the population figures are correct, the true total Evanston
income is within 19 million of 228 million, with 95 per cent
probability. This is based on the standard error of the arithmetic mean.
For Winnetka the true total income is within 2 million of 65 million,
with 95 per cent probability. Figures after the decimal are not
significant in estimating total income.
TABLE 1

INCOME OF EVANSTON RESIDENTS AND INCOME OF WINNETKA RESIDENTS,
BY SOURCES: 1954 (MILLIONS OF DOLLARS)

Columns: INCOME SOURCE | Evanston: Income from Exogenous Sources (By
Assumption) | Evanston: Income from Endogenous Sources (By Assumption) |
Winnetka: Income from Exogenous Sources (By Assumption) | Winnetka:
Income from Endogenous Sources (By Assumption)

TOTAL INCOME ACCRUING TO RESIDENTS
Income earned outside } 
Community 175.0 60.9 
Income Earned Inside Community 
by Sources 
Government Sector 
Local Government (including 
schools) 4.2 0.7 
Federal Government (Post 
Of fice) 1.¢ 0.1 
Retail Sector 
Wages paid resident employees 0.2 0.6 
Food Stores | 
Originating from sales | 1.0 
to residents. 
Originating from sales 0.1 1.0 
to non-residents. 
Other retail activity 
Originating from sales | 
to residents. 3.7 
Originating from sales 
to non-residents. 3.8 
Income to resident owners 0.1 | 0.5 
Food stores 
Originating from sales neg. 
to residents. ' 
Originating from sales neg. 
to non-residents. 
Other retail activity 
Originating from sales 2 
to residents. | 
Originating from sales 0 
to non-residents. 
Wholesale Sector | 
Wages paid resident employees | 
Originating from sales 0.2 | 
to residents. | | 
Originating from sales | 3.6 | 
to non-residents. | | 
Income to resident owners | 
Originating from sales | | 
to residents. neg. 
Originating from sales | | 
to non-residents | 0.4 | 
Manufacturing Sector | | 
Wages paid resident employees | 
Originating from sales | } | | 
to non-residents. | 8.0 | | 
Income to resident owners | | 
Originating from sales | j 
to non-residents e 
Service Sector 
Wages paid resident employees } | 0.4° 1.0° 
Covered employment | 
Originating from sales } | 
to residents. | 1.2 | 
Originating from sales | | 
to non-residents. 1.4 
University employment | - } 
Rental Income | 2.6 | | 


Medical care | 
Originating from sales | | 
to residents. | + 
Originating from sales 
to non-residents. } 
Other service employment 
Originating from sales | 
to residents. | } 0.6 
Originating from sales | | 
to non-residents. 
Statistical discrepancy 1.4 





TOTALS 208.2














*Includes income of self employed. 
An alternative method of deriving the estimates of $b$ is to estimate
$g$ and $h$ separately--recalling that $b = gh$. The average propensity
to consume local goods and services (including governments), $g$, may be
estimated from total incomes (65.1 and 228.0) and the service and retail
sales made to local residents (18.8 and 114.0).11 The values are
$g_w = .29$ (Winnetka) and $g_e = .50$ (Evanston). The parameter $h$
represents the income created per dollar of sales. Thus, the ratio of
total income earned inside the community originating in these activities
to total retail, service and government sales gives the estimates
$h_w = .156$ and $h_e = .163$.12 Computing the multiplier values by
$k = 1/(1 - gh)$ yields Winnetka, 1.047 and Evanston, 1.089. The
difference between these values and those given above represents a
statistical discrepancy.
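The alternative estimate can be retraced in the same way. The Python sketch below is illustrative only, with a hypothetical function name; it takes the rounded g and h values reported in the text as given and applies k = 1/(1 - gh).

```python
def multiplier_from_g_h(g, h):
    """Multiplier k = 1 / (1 - g*h) from the two propensities."""
    return 1.0 / (1.0 - g * h)


# Propensity to consume local goods, from sales to residents over total income.
print(round(18.8 / 65.1, 2))     # Winnetka, 0.29
print(round(114.0 / 228.0, 2))   # Evanston, 0.5

# Multipliers from the reported g and h values.
print(round(multiplier_from_g_h(0.29, 0.156), 3))   # Winnetka, 1.047
print(round(multiplier_from_g_h(0.50, 0.163), 3))   # Evanston, 1.089
```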


One way to interpret the results is as follows. Holding constant 
such variables as age, number of children, and so forth, suppose a new 
spending unit with an income equal to the average of established resi- 
dents moves into Evanston and provides the exogenous change. (Assume 
the head of the spending unit commutes to Chicago.) The initial in- 
crease in income will be 9,490 dollars. Using the first values of the 
multiplier, the total increase in income accruing to residents will be 
10,401 dollars. Likewise, a family with an income of 18,400 dollars a
year moving into Winnetka generates a total addition of 19,394 dollars.
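
As a rough check on this interpretation, the multiplier implied by each
pair of figures can be backed out as the ratio of the total increase to
the initial increase. The sketch below does only that; the function and
variable names are illustrative assumptions. It also applies the
alternative multipliers (1.089 and 1.047) to the same initial incomes
for comparison.

```python
def implied_multiplier(initial_income, total_increase):
    """Ratio of total income increase to the initial exogenous increase."""
    return total_increase / initial_income


def total_increase(initial_income, k):
    """Total income accruing to residents for a given multiplier k."""
    return initial_income * k


# Dollar figures quoted in the text.
evanston_k = round(implied_multiplier(9_490, 10_401), 3)
winnetka_k = round(implied_multiplier(18_400, 19_394), 3)

# Same initial incomes under the alternative (g, h) multipliers.
evanston_alt = round(total_increase(9_490, 1.089))
winnetka_alt = round(total_increase(18_400, 1.047))

print(evanston_k, winnetka_k)
print(evanston_alt, winnetka_alt)
```

The implied ratios are slightly higher than the alternative multipliers,
which is the discrepancy between the two estimation methods noted
earlier.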


The empirical results, which are admittedly subject to large errors,
are at least consistent with a priori notions of multiplier values.
Evanston is a larger and more self-contained community. In consequence
one expects, and is pleased to find, that g and h are higher than for
Winnetka.¹³ It is important to note that the communities studied
constitute only a small portion of a metropolitan region. If the whole
metropolitan region is the area under analysis, both g and h will be
substantially larger.





¹¹ Total retail and service sales are given by the 1954 Census of
Business. Surveys in the business sector enable us to estimate the per
cent sold to residents.


¹² In computing both g and h for Evanston, the total income earned
inside the community was reduced by 15.8 million in order to exclude
income from manufacturing, the university, and a home office of a
national insurance company. These, clearly, do not enter into local
consumption function estimates. Government spendings were entered as
the wages and salaries paid local employees.


¹³ A minimal estimate of a community income multiplier for Ann Arbor,
Michigan (population some 40,000 excluding college students) is of the
same magnitude as Evanston. This is again consistent with a priori
notions that the larger size of Evanston is offset by the relative
geographic isolation of Ann Arbor.
5. LOCAL CONSUMPTION: INCOMES AND YEARS OF RESIDENCE


Empirical evidence taken from these studies, and from studies in
Ann Arbor, Michigan, and Lake Forest, Illinois, provides some additional
information on the propensity to consume local goods: (1) with
increased income the proportion spent outside the community increases;
and (2) newer residents buy relatively more outside the community.¹⁴
We can thus elaborate assumptions six and seven.


Table 2 presents the distribution of expenditures on durable goods
inside and outside the community for different income brackets. The
total propensity to consume of residents is given as the propensity to
consume local goods (g) plus a propensity to import (m). If total
expenditures of residents behave in the same manner as durables, Table 2
suggests that the ratio g/m falls as income increases, i.e., importing
becomes relatively more important.¹⁵ Thus, the higher the income of
the new resident, the lower the value of the multiplier. This, then,
results from two factors: (1) as the data suggest, relatively greater
expenditures on imports (m); and (2) as usually assumed, a smaller
propensity to consume (g + m).
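
The direction of this effect can be illustrated with the multiplier
formula k = 1/(1 - gh) used earlier. The sketch below holds h at the
Evanston estimate of .163 and lowers g from the Evanston estimate of
.50; the two lower values of g are hypothetical and are not taken from
the surveys.

```python
def multiplier(g, h):
    """Community income multiplier k = 1 / (1 - g*h)."""
    return 1.0 / (1.0 - g * h)


h = 0.163  # Evanston estimate quoted in the text

# 0.50 is the Evanston estimate; 0.40 and 0.30 are hypothetical values
# standing in for residents who spend relatively more on imports.
g_values = [0.50, 0.40, 0.30]

ks = [round(multiplier(g, h), 3) for g in g_values]

for g, k in zip(g_values, ks):
    print(f"g = {g:.2f} -> k = {k:.3f}")
```

With h fixed, each reduction in g lowers k, which is the sense in which
a higher-income resident implies a lower multiplier.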


The assumption that new residents will have spending patterns
similar to established residents is open to question. The expenditures
connected with moving (establishing the domicile) as well as demographic
variables might give rise to different expenditure patterns. Beyond
these qualifications, our study provides some findings relevant to
assumption six. Table 3 presents data on the purchase of durables by
years of residence inside the community. (Note that spending units
with less than one year's residence are excluded.) Newer residents
tend to purchase a greater percentage of their durables outside the
community. Moreover, this relationship holds with incomes held
constant.¹⁶ If new residents have a total propensity to consume (g + m)
similar to established residents, it implies g must be relatively
smaller for newer residents. In turn, the multiplier will be lower than
an estimate based on the average spending patterns of all residents.





¹⁴ Further details concerning the Ann Arbor survey are found in my THE
COMMUNITY..., op. cit. The Lake Forest survey is discussed in a
monograph, Raymond Murphy and Charles Tiebout, THE LAKE FOREST-LAKE
BLUFF ECONOMY: A SURVEY OF INCOMES AND EXPENDITURES, 1956. In Tables 2
and 3 Winnetka is omitted because of its small size. No meaningful
division of expenditures inside and outside of the community is
possible since so many spending units purchased outside the community.


¹⁵ Spending units were also asked where they purchased clothing. The
same patterns given for durables in Tables 2 and 3 repeat for clothing.
The greater purchases outside the community by higher income spending
units probably reflect: (1) the more specialized items purchased; and
(2) the increased mobility of the housewife associated with more second
cars.


¹⁶ The effect might be accounted for by greater association with
downtown stores: (1) by new suburban residents who moved out from the
central city; or (2) by new residents who, at first, are acquainted
only with downtown stores: Fields, Carson Pirie and Scott in Chicago,
and Hudsons in Detroit.
TABLE 2 - Distribution of Durable Purchases: By Income Brackets*

                                Per cent of Spending    Per cent of Spending
                                Units Making Purchase   Units Making Purchase
                                Inside the Community    Outside of the
Community and Income Bracket    Only                    Community

Ann Arbor, Michigan
  0 - $4,999                    56%                     44%
  $5,000 - $9,999               35%                     65%
  over $10,000                  38%                     62%

Evanston, Illinois
  0 - $4,999                    71%                     29%
  $5,000 - $9,999               29%                     71%
  over $10,000                  18%                     82%

Lake Forest, Illinois
  0 - $5,999                    18%                     82%
  $6,000 - $14,999               6%                     94%
  over $15,000                   5%                     95%

*Durables are defined in approximately the same manner as in the Surveys
of Consumers Finance (see FEDERAL RESERVE BULLETIN, June issues).
Included are such items as: radios, television, furniture, washers,
carpets, and so forth. Due to the small number of respondents purchasing
automobiles, they are not included in this classification.





TABLE 3 - Distribution of Durable Purchases: By Years of Residence*

                                    Purchased one or    Purchased one or
                                    more Durables       more Durables
                                    Inside the          Outside the
Community and Years of Residence    Community           Community

Ann Arbor, Michigan
  One to 9 years                    33%                 67%
  9 or more years                   49%                 51%

Evanston, Illinois
  One to 9 years                    19%                 81%
  9 or more years                   31%                 69%

Lake Forest, Illinois
  One to 9 years                     6%                 94%
  9 or more years                   13%                 87%

*Durable classification, see Table 2.
6. CONCLUSIONS


Subject to the usual qualifications and reservations, these findings
show multiplier values strikingly lower than national multipliers.
Further, they are substantially lower than the values estimated in
other regional studies. In part, of course, this merely reflects
differences in the size of the region considered, and will vary with the
particular form of the multiplier used.¹⁷





¹⁷ In part this stems from the choice of income accruing to residents as
the social aggregate. If some concept such as income accruing to
residents and those employed in the community were used, the wages paid
non-residents would not be leakages. Thus, the multiplier values would
be higher. Further, data shortages force many regional studies to use
the “residuals method” to estimate leakages. In Wisconsin, for example,
this method would note that domestic automobile production (American
Motors) exceeds domestic consumption. The difference, therefore, is
assumed to represent net exports. In turn, no automobile imports
(leakages) are assumed. This method, because of product mix problems of
this type, understates the leakages, i.e., overstates the multiplier.
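
The product mix problem described in this note can be made concrete with
a small sketch. All figures below are hypothetical and chosen only for
illustration; none come from Wisconsin data, and the function names are
assumptions. The residuals method sees only the net difference between
production and consumption, while gross flows depend on how much of
local consumption is actually supplied by local producers.

```python
def residual_estimates(production, consumption):
    """Residuals method: only the net difference is observed."""
    net = production - consumption
    return {"net_exports": max(net, 0), "imports": max(-net, 0)}


def gross_flows(production, consumption, local_share):
    """Gross flows when only a share of consumption is locally made."""
    locally_supplied = consumption * local_share
    return {
        "exports": production - locally_supplied,
        "imports": consumption - locally_supplied,
    }


# Hypothetical figures: production exceeds consumption, but only a
# quarter of what residents buy is the locally produced make.
production, consumption, local_share = 100.0, 60.0, 0.25

residual = residual_estimates(production, consumption)
gross = gross_flows(production, consumption, local_share)

print(residual)  # imports appear to be zero under the residuals method
print(gross)     # gross imports (leakages) are in fact positive
```

In this hypothetical case the residuals method records no imports even
though most local consumption is imported, so leakages are understated
and the multiplier overstated.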